DOCUMENT RESUME

ED 10[illegible] 480                              TM 00[illegible] 326

TITLE          Expected Value of a Sample Estimate.
INSTITUTION    Statistical Reporting Service (DOA), Washington, D.C.
REPORT NO      SRS-19
PUB DATE       Sep 74
NOTE           [illegible]6p.
EDRS PRICE     MF-$0.76 HC-$6.97 PLUS POSTAGE
DESCRIPTORS    *Agriculture; Algebra; Classification; Data
               Collection; Design; *Mathematical Applications;
               Mathematics; Matrices; Probability; *Probability
               Theory; *Sampling; Set Theory; *Statistical Analysis;
               Statistical Bias; Statistical Surveys
IDENTIFIERS    Random Variables; Variance (Statistical)

ABSTRACT
Intended as a reference for the convenience of
students in sampling, this monograph attempts to express relevant,
introductory mathematics and probability in the context of sample
surveys. Although some proofs are presented, the emphasis is more on
exposition of mathematical language and concepts than on the
mathematics per se and rigorous proofs. Many problems are given as
exercises so a student may test his interpretation or understanding
of the concepts. Most of the mathematics is elementary. Each chapter
begins with simple explanations and ends at a much more advanced
level. Students with only high school algebra should have no
difficulty with the first parts of each chapter. Chapters 1 and 2
were added as background for Chapter 3 which discusses expected
values of random variables. Chapter 4 focuses attention on the
distribution of an estimate which is the basis for comparing the
accuracy of alternative sampling plans as well as a basis for
statements about the accuracy of an estimate from a sample. The
content of chapter 4 is included in books on sampling, but it is
important that students hear or read more than one discussion of the
distribution of an estimate, especially with reference to estimates
from actual sample surveys. (Author/BC)






Expected Value of a Sample Estimate

Statistical Reporting Service • U.S. Department of Agriculture • SRS No. 19






FOREWORD

The Statistical Reporting Service (SRS) has been engaged for
many years in the training of agricultural statisticians from around
the world. Most of these participants come under the support of the
Agency for International Development (AID) training programs; however,
many also come under sponsorship of the Food and Agriculture Organization
into the International Statistical Programs Center of the Bureau of the
Census, with which SRS is cooperating.

This treatise was developed by the SRS with the cooperation of
AID and the Center, in an effort to provide improved materials for
teaching and reference in the area of agricultural statistics, not
only for foreign students but also for development of staff working
for these agencies.

HARRY C. TRELOGAN
Administrator
Statistical Reporting Service

Washington, D.C.



September 1974 



PREFACE 



The author has felt that applied courses in sampling should give more
attention to elementary theory of expected values of a random variable.
The theory pertaining to a random variable and to functions of random
variables is the foundation for probability sampling. Interpretations
of the accuracy of estimates from probability sample surveys are predicated
on, among other things, the theory of expected values.

There are many students with career interests in surveys and the
application of probability sampling who have very limited backgrounds in
mathematics and statistics. Training in sampling should go beyond simply
learning about sample designs in a descriptive manner. The foundations
in mathematics and probability should be included. It can (1) add much
to the breadth of understanding of bias, random sampling error, components
of error, and other technical concepts; (2) enhance one's ability to make
practical adaptations of sampling principles and correct use of formulas;
and (3) make communication with mathematical statisticians easier and more
meaningful.

This monograph is intended as a reference for the convenience of
students in sampling. It attempts to express relevant, introductory
mathematics and probability in the context of sample surveys. Although
some proofs are presented, the emphasis is more on exposition of
mathematical language and concepts than on the mathematics per se and rigorous
proofs. Many problems are given as exercises so a student may test his
interpretation or understanding of the concepts. Most of the mathematics
is elementary. If a formula looks involved, it is probably because it
represents a long sequence of arithmetic operations.

Each chapter begins with very simple explanations and ends at a much
more advanced level. Most students with only high school algebra should
have no difficulty with the first parts of each chapter. Students with a
few courses in college mathematics and statistics might review the first
parts of each chapter and spend considerable time studying the latter parts.
In fact, some students might prefer to start with Chapter III and refer to
Chapters I and II only as needed.

Discussion of expected values of random variables, as in Chapter III,
was the original purpose of this monograph. Chapters I and II were added
as background for Chapter III. Chapter IV focuses attention on the
distribution of an estimate which is the basis for comparing the accuracy
of alternative sampling plans as well as a basis for statements about the
accuracy of an estimate from a sample. The content of Chapter IV is
included in books on sampling, but it is important that students hear or
read more than one discussion of the distribution of an estimate,
especially with reference to estimates from actual sample surveys.

The author's interest and experience in training has been primarily
with persons who had begun careers in agricultural surveys. I appreciate
the opportunity, which the Statistical Reporting Service has provided, to
prepare this monograph.

Earl E. Houseman 

Statistician 






CONTENTS

Chapter I. Notation and Summation
    1.1 Introduction
    1.2 Notation and the Symbol for Summation
    1.3 Frequency Distributions
    1.4 Algebra
    1.5 Double Indexes and Summation
        1.5.1 Cross Classification
        1.5.2 Hierarchal or Nested Classification
    1.6 The Square of a Sum
    1.7 Sums of Squares
        1.7.1 Nested Classification
        1.7.2 Cross Classification

Chapter II. Random Variables and Probability
    2.1 Random Variables
    2.2 Addition of Probabilities
    2.3 Multiplication of Probabilities
    2.4 Sampling With Replacement
    2.5 Sampling Without Replacement
    2.6 Simple Random Samples
    2.7 Some Examples of Restricted Random Sampling
    2.8 Two-Stage Sampling

Chapter III. Expected Values of Random Variables
    3.1 Introduction
    3.2 Expected Value of the Sum of Two Random Variables
    3.3 Expected Value of an Estimate
    3.4 Variance of a Random Variable
        3.4.1 Variance of the Sum of Two Independent Random Variables
        3.4.2 Variance of the Sum of Two Dependent Random Variables
    3.5 Variance of an Estimate
        3.5.1 Equal Probability of Selection
        3.5.2 Unequal Probability of Selection
    3.6 Variance of a Linear Combination
    3.7 Estimation of Variance
        3.7.1 Simple Random Sampling
        3.7.2 Unequal Probability of Selection
    3.8 Ratio of Two Random Variables
    3.9 Conditional Expectation
    3.10 Conditional Variance

Chapter IV. The Distribution of an Estimate
    4.1 Properties of Simple Random Samples
    4.2 Shape of the Sampling Distribution
    4.3 Sample Design
    4.4 Response Error
    4.5 Bias and Standard Error



CHAPTER I. NOTATION AND SUMMATION

1.1 INTRODUCTION

To work with large amounts of data, an appropriate system of notation
is needed. The notation must identify data by individual elements, and
provide meaningful mathematical expressions for a wide variety of summaries
from individual data. This chapter describes notation and introduces
summation algebra, primarily with reference to data from census and sample
surveys. The purpose is to acquaint students with notation and summation
rather than to present statistical concepts. Initially some of the
expressions might seem complex or abstract, but nothing more than sequences of
operations involving addition, subtraction, multiplication, and division
is involved. Exercises are included so a student may test his
interpretation of different mathematical expressions. Algebraic manipulations are
also discussed and some algebraic exercises are included. To a
considerable degree, this chapter could be regarded as a manual of exercises for
students who are interested in sampling but are not fully familiar with
the summation symbol, $\Sigma$. Familiarity with the mathematical language will
make the study of sampling much easier.

1.2 NOTATION AND THE SYMBOL FOR SUMMATION

"Element" will be used in this monograph as a general expression for
a unit that a measurement pertains to. An element might be a farm, a
person, a school, a stalk of corn, or an animal. Such units are sometimes
called units of observation or reporting units. Generally, there are
several characteristics or items of information about an element that one
might be interested in.






"Measurement" or "value" will be used as general terms for the
numerical value of a specified characteristic for an element. This
includes assigned values. For example, the element might be a farm and
the characteristic could be whether wheat is being grown or is not being
grown on a farm. A value of "1" could be assigned to a farm growing wheat
and a value of "0" to a farm not growing wheat. Thus, the "measurement"
or "value" for a farm growing wheat would be "1" and for a farm not
growing wheat the value would be "0."

Typically, a set of measurements of N elements will be expressed as
follows: $X_1, X_2, \ldots, X_N$ where X refers to the characteristic that is
measured and the index (subscript) to the various elements of the
population (or set). For example, if there are N persons and the characteristic
X is a person's height, then $X_1$ is the height of the first person, etc.
To refer to any one of the elements, not a specific element, a subscript "i"
is used. Thus, $X_i$ (read "X sub i") means the value of X for any one of the
N elements. A common expression would be "$X_i$ is the value of X for the
$i^{th}$ element."

The Greek letter $\Sigma$ (capital sigma) is generally used to indicate a
sum. When found in an equation, it means "the sum of." For example,
$\sum_{i=1}^{N} X_i$ represents the sum of all values of X from $X_1$ to $X_N$; that is,

$$\sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N$$

The lower and upper limits of the index of summation are shown below and
above the summation sign. For example, to specify the sum of X for
elements 11 thru 20 one would write $\sum_{i=11}^{20} X_i$.






You might also see notation such as "$\Sigma X_i$ where $i = 1, 2, \ldots, N$," which
indicates there are N elements (or values) in the set indexed by serial
numbers 1 thru N, or for part of a set you might see "$\Sigma X_i$ where $i = 11,
12, \ldots, 20$." Generally the index of summation starts with 1; so you will
often see a summation written as $\sum_{i}^{N} X_i$. That is, only the upper limit of
the summation is shown and it is understood that the summation begins with
$i = 1$. Alternatively, when the set of values being summed is clearly
understood, the lower and upper limits might not be shown. Thus, it is
understood that $\Sigma X_i$ or $\sum_{i} X_i$ is the sum of X over all values of the set under
consideration. Sometimes a writer will even drop the subscript and use
$\Sigma X$ for the sum of all values of X. Usually the simplest notation that is
adequate for the purpose is adopted. In this monograph, there will be
some deliberate variation in notation to familiarize students with various
representations of data.
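
For readers who find it helpful to see the summation symbol carried out
mechanically, the following is a minimal sketch in Python. The data values
and the helper name `sigma` are hypothetical and are not part of the
monograph; the only point illustrated is how the lower and upper limits of
the index select the terms that enter the sum.

```python
# Hypothetical data: X[0] holds X_1, X[1] holds X_2, and so on,
# because Python lists are indexed from 0 while the text indexes from 1.
X = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4, 6, 2]

def sigma(values, lower, upper):
    """Sum the values for the 1-based index i = lower, ..., upper (inclusive)."""
    return sum(values[i - 1] for i in range(lower, upper + 1))

N = len(X)
total_all = sigma(X, 1, N)        # sum of X_i for i = 1, ..., N
total_11_20 = sigma(X, 11, 20)    # sum of X_i for i = 11, ..., 20

print(total_all, total_11_20)
```

Writing the limits explicitly, as in `sigma(X, 11, 20)`, plays the same role
as writing them below and above the summation sign.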

An average is usually indicated by a "bar" over the symbol. For
example, $\bar{X}$ (read "X bar," or sometimes "bar X") means the average value of
X. Thus, $\bar{X} = \frac{\sum_{i}^{N} X_i}{N}$. In this case, showing the upper limit, N, of the
summation makes it clear that the sum is being divided by the number of elements
and $\bar{X}$ is the average of all elements. However, $\frac{\Sigma X_i}{N}$ would also be
interpreted as the average of all values of X unless there is an indication to
the contrary.

Do not try to study mathematics without pencil and paper. Whenever
the shorthand is not clear, try writing it out in long form. This will
often reduce any ambiguity and save time.

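Here are some examples of mathematical shorthand:

(1) Sum of the reciprocals of X:
    $\sum_{i}^{N} \frac{1}{X_i} = \frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_N}$

(2) Sum of the differences between X and a constant, C:
    $\sum_{i}^{N} (X_i - C) = (X_1 - C) + (X_2 - C) + \cdots + (X_N - C)$

(3) Sum of the deviations of X from the average of X:
    $\Sigma (X_i - \bar{X}) = (X_1 - \bar{X}) + (X_2 - \bar{X}) + \cdots + (X_N - \bar{X})$

(4) Sum of the absolute values of the differences between $X_i$ and $\bar{X}$
    (absolute value, indicated by the vertical lines, means the positive
    value of the difference):
    $\Sigma |X_i - \bar{X}| = |X_1 - \bar{X}| + |X_2 - \bar{X}| + \cdots + |X_N - \bar{X}|$

(5) Sum of the squares of $X_i$:
    $\Sigma X_i^2 = X_1^2 + X_2^2 + X_3^2 + \cdots + X_N^2$

(6) Sum of squares of the deviations of X from $\bar{X}$:
    $\Sigma (X_i - \bar{X})^2 = (X_1 - \bar{X})^2 + \cdots + (X_N - \bar{X})^2$

(7) Average of the squares of the deviations of X from $\bar{X}$:
    $\frac{\sum_{i}^{N} (X_i - \bar{X})^2}{N} = \frac{(X_1 - \bar{X})^2 + \cdots + (X_N - \bar{X})^2}{N}$

(8) Sum of products of X and Y:
    $\sum_{i}^{N} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \cdots + X_N Y_N$

(9) Sum of quotients of X divided by Y:
    $\Sigma \frac{X_i}{Y_i} = \frac{X_1}{Y_1} + \frac{X_2}{Y_2} + \cdots + \frac{X_N}{Y_N}$

(10) Sum of X divided by the sum of Y:
    $\frac{\Sigma X_i}{\Sigma Y_i} = \frac{X_1 + X_2 + \cdots + X_N}{Y_1 + Y_2 + \cdots + Y_N}$

(11) Sum of the first N digits:
    $\sum_{i=1}^{N} i = 1 + 2 + 3 + \cdots + N$

(12) $\sum_{i=1}^{N} iX_i = X_1 + 2X_2 + 3X_3 + \cdots + NX_N$

(13) $\sum_{i=1}^{6} (-1)^i X_i = -X_1 + X_2 - X_3 + X_4 - X_5 + X_6$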



Exercise 1.1. You are given a set of four elements having the
following values of X: $X_1 = 2$, $X_2 = 0$, $X_3 = 5$, $X_4 = 7$. To test your
understanding of the summation notation, compute the values of the
following algebraic expressions:

Expression                                                  Answer

(1)  $\sum_{i=1}^{4} (X_i + 4)$                               30
(2)  $\Sigma 2(X_i - 1)$                                      20
(3)  $2\Sigma (X_i - 1)$                                      20
(4)  $\Sigma 2X_i - 1$                                        27
(5)  $\bar{X} = \frac{\Sigma X_i}{N}$                         3.5
(6)  $\Sigma X_i^2$                                           78
(7)  $\Sigma (-X_i)^2$                                        78
(8)  [illegible]
(9)  $\Sigma (X_i^2 - X_i)$                                   64
(10) $\Sigma (X_i^2) - \Sigma X_i$                            64
(11) $\Sigma i(X_i)$                                          45
(12) $\Sigma (-1)^i (X_i)$                                    0
(13) $\sum_{i=1}^{4} (X_i^2 - 3)$                             66
(14) $\sum_{i=1}^{4} X_i^2 - \sum_{i=1}^{4} (3)$              66
     Note: $\sum_{i=1}^{4} (3)$ means find the sum of four 3's.
(15) $\Sigma (X_i - \bar{X})$                                 0
(16) $\frac{\Sigma (X_i - \bar{X})^2}{N-1}$                   29/3
(17) $\frac{\Sigma [X_i^2 - 2X_i\bar{X} + \bar{X}^2]}{N-1}$   29/3
(18) $\frac{\Sigma X_i^2 - N\bar{X}^2}{N-1}$                  29/3
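
As a way of checking hand calculations, several of the expressions in
Exercise 1.1 can be evaluated in a few lines of Python. This is only a
sketch: the variable names are hypothetical, and the values of X are the
four values stated in the exercise (2, 0, 5, 7).

```python
X = [2, 0, 5, 7]              # X_1, X_2, X_3, X_4
N = len(X)
mean = sum(X) / N             # X-bar

sum_shifted = sum(x + 4 for x in X)               # expression (1)
sum_doubled = sum(2 * (x - 1) for x in X)         # expression (2)
sum_minus_one = sum(2 * x for x in X) - 1         # expression (4): the 1 lies outside the sum
sum_squares = sum(x ** 2 for x in X)              # expression (6)
# enumerate(..., start=1) supplies the 1-based index i used in the text
weighted = sum(i * x for i, x in enumerate(X, start=1))             # expression (11)
alternating = sum((-1) ** i * x for i, x in enumerate(X, start=1))  # expression (12)
dev_squares = sum((x - mean) ** 2 for x in X)     # numerator of expression (16)

print(sum_shifted, sum_doubled, sum_minus_one, sum_squares,
      weighted, alternating, dev_squares)
```

Note how expression (4) differs from expression (2) only in where the
parentheses fall, which is why the two results differ.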

Definition 1.1. The variance of X where $X = X_1, X_2, \ldots, X_N$, is
defined in one of two ways:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}$$

or

$$S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}$$

The reason for the two definitions will be explained in Chapter III.
The variance formulas provide measures of how much the values of X vary
(deviate) from the average. The square root of the variance of X is
called the standard deviation of X. The central role that the above
definitions of variance and standard deviation play in sampling theory
will become apparent as you study sampling. The variance of an estimate
from a sample is one of the measures needed to judge the accuracy of the
estimate and to evaluate alternative sampling designs. Much of the algebra
and notation in this chapter is related to computation of variance. For
complex sampling plans, variance formulas are complex. This chapter
should help make the mathematics used in sampling more readable and more
meaningful when it is encountered.
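
The two divisors in Definition 1.1 can be compared numerically. The sketch
below uses hypothetical data and hypothetical variable names; it computes
the sum of squared deviations once and divides it by N and by N - 1. The
functions `statistics.pvariance` and `statistics.variance` from the Python
standard library use the divisors N and N - 1 respectively and are shown
only as a cross-check.

```python
import math
import statistics

X = [2, 0, 5, 7]                          # hypothetical values of X
N = len(X)
mean = sum(X) / N
ss = sum((x - mean) ** 2 for x in X)      # sum of squared deviations from the mean

var_divisor_n = ss / N                    # first form of Definition 1.1
var_divisor_n_minus_1 = ss / (N - 1)      # second form of Definition 1.1
std_dev = math.sqrt(var_divisor_n_minus_1)   # standard deviation from the second form

# Cross-check against the standard library
print(var_divisor_n, statistics.pvariance(X))
print(var_divisor_n_minus_1, statistics.variance(X))
print(std_dev)
```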

Definition 1.2. "Population" is a statistical term that refers to
a set of elements from which a sample is selected ("Universe" is often
used instead of "Population").

Some examples of populations are farms, retail stores, students,
households, manufacturers, and hospitals. A complete definition of a
population is a detailed specification of the elements that compose it.
Data to be collected also need to be defined. Problems of defining
populations to be surveyed should receive much attention in courses on sampling.
From a defined population a sample of elements is selected, information
for each element in the sample is collected, and inferences from the
sample are made about the population. Nearly all populations for sample
surveys are finite so the mathematics and discussion in this monograph
are limited to finite populations.

In the theory of sampling, it is important to distinguish between
data for elements in a sample and data for elements in the entire
population. Many writers use uppercase letters when referring to the population
and lowercase letters when referring to a sample. Thus $X_1, \ldots, X_N$ would
represent the values of some characteristic X for the N elements of the
population; and $x_1, \ldots, x_n$ would represent the values of X in a sample of
n elements. The subscripts in $x_1, \ldots, x_n$ simply index the different
elements in a sample and do not correspond to the subscripts in $X_1, \ldots, X_N$
which index the elements of the population. In other words, $x_i$ could be
any one of the $X_i$'s. Thus,

$$\frac{\sum_{i}^{N} X_i}{N} = \bar{X}$$ represents the population mean, and

$$\frac{\sum_{i}^{n} x_i}{n} = \bar{x}$$ represents a sample mean.

In this chapter we will be using only uppercase letters, except for
constants and subscripts, because the major emphasis is on symbolic
representation of data for a set of elements and on algebra. For this purpose,
it is sufficient to start with data for a set of elements and not be
concerned with whether the data are for a sample of elements or for all
elements in a population.

The letters X, Y, and Z are often used to represent different
characteristics (variables) whereas the first letters of the alphabet are commonly
used as constants. There are no fixed rules regarding notation. For
example, four different variables or characteristics might be called $X_1$,
$X_2$, $X_3$, and $X_4$. In that case $X_{1i}$ might be used to represent the $i^{th}$ value
of the variable $X_1$. Typically, writers adopt notation that is convenient
for their problems. It is not practical to completely standardize notation.

Exercise 1.2. In the list of expressions in Exercise 1.1 find the
variance of X, that is, find $S^2$. Suppose that $X_4$ is 15 instead of 7. How
much is the variance of X changed? Answer: From $9\frac{2}{3}$ to $44\frac{1}{3}$.

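Exercise 1.3. You are given four elements having the following values
of X and Y:

    $X_1 = 2$    $X_2 = 0$    $X_3 = 5$    $X_4 = 7$
    $Y_1 = 2$    $Y_2 = 3$    $Y_3 = 1$    $Y_4 = 14$

Find the value of the following expressions:

Expression                                                Answer

(1)  $\Sigma X_i Y_i$                                       107
(2)  $(\Sigma X_i)(\Sigma Y_i)$                             280
(3)  $\Sigma (X_i - \bar{X})(Y_i - \bar{Y})$                37
(4)  $\Sigma X_i Y_i - N\bar{X}\bar{Y}$                     37
(5)  $\frac{1}{N} \Sigma \frac{X_i}{Y_i}$                   1.625
(6)  $\Sigma (X_i - Y_i)$                                   -6
(7)  $\Sigma X_i - \Sigma Y_i$                              -6
(8)  $\Sigma (X_i - Y_i)^2$                                 74
(9)  $\Sigma (X_i^2 - Y_i^2)$                               -132
(10) $\Sigma X_i^2 - \Sigma Y_i^2$                          -132
(11) $[\Sigma (X_i - Y_i)]^2$                               36
(12) $[\Sigma X_i]^2 - [\Sigma Y_i]^2$                      -204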



1.3 FREQUENCY DISTRIBUTIONS 

Several elements in a set of N might have the same value for some
characteristic X. For example, many people have the same age. Let $X_j$
be a particular age and let $N_j$ be the number of people in a population
(set) of N people who have the age $X_j$. Then $\sum_{j=1}^{K} N_j = N$ where K is the
number of different ages found in the population. Also $\Sigma N_j X_j$ is the sum
of the ages of the N people in the population and $\frac{\Sigma N_j X_j}{\Sigma N_j}$ represents the
average age of the N people. A listing of $X_j$ and $N_j$ is called the
frequency distribution of X, since $N_j$ is the number of times (frequency)
that the age $X_j$ is found in the population.

On the other hand, one could let $X_i$ represent the age of the $i^{th}$
individual in a population of N people. Notice that j was an index of age.
We are now using i as an index of individuals, and the average age would
be written as $\frac{\Sigma X_i}{N}$. Note that $\Sigma N_j X_j = \Sigma X_i$ and that
$\frac{\Sigma N_j X_j}{\Sigma N_j} = \frac{\Sigma X_i}{N}$. The
choice between these two symbolic representations of the age of people in
the population is a matter of convenience and purpose.
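
The equivalence of the two representations can be seen by building a
frequency distribution from a list of individual values. The following is a
sketch with hypothetical ages and hypothetical variable names;
`collections.Counter` from the Python standard library is used to count how
many times each distinct value occurs.

```python
from collections import Counter

# Hypothetical ages X_i, indexed by individual
ages = [21, 34, 21, 40, 34, 21, 55, 40, 34, 21]
N = len(ages)

# Frequency distribution: each distinct age X_j mapped to its count N_j
freq = Counter(ages)
K = len(freq)                                   # number of different ages

total_by_individual = sum(ages)                                   # sum of X_i
total_by_frequency = sum(n_j * x_j for x_j, n_j in freq.items())  # sum of N_j X_j

mean_by_individual = total_by_individual / N
mean_by_frequency = total_by_frequency / sum(freq.values())

print(K, dict(freq))
print(total_by_individual, total_by_frequency)
print(mean_by_individual, mean_by_frequency)
```

Both totals and both means agree because summing $N_j X_j$ over ages only
regroups the terms of the sum of $X_i$ over individuals.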

Exercise 1.4. Suppose there are 20 elements in a set (that is, N = 20)
and that the values of X for the 20 elements are: 4, 8, 3, 7, 3, 8, 3, 3,
7, 2, 8, 4, 8, 8, 3, 7, 8, 10, 3, 8.

(1) List the values of $X_j$ and $N_j$, where j is an index of the
    values 2, 3, 4, 7, 8, and 10. This is the frequency
    distribution of X.
(2) What is K equal to?

Interpret and verify the following by making the calculations indicated:

(3) $\sum_{i=1}^{N} X_i = \sum_{j=1}^{K} N_j X_j$

(4) $\frac{\Sigma X_i}{N} = \frac{\Sigma N_j X_j}{\Sigma N_j} = \bar{X}$

(5) $\frac{\Sigma (X_i - \bar{X})^2}{N} = \frac{\Sigma N_j (X_j - \bar{X})^2}{\Sigma N_j}$

1.4 ALGEBRA 

In arithmetic and elementary algebra, the order of the numbers when
addition or multiplication is performed does not affect the results. The
familiar arithmetic laws when extended to algebra involving the summation
symbol lead to the following important rules or theorems:

Rule 1.1   $\Sigma (X_i - Y_i + Z_i) = \Sigma X_i - \Sigma Y_i + \Sigma Z_i$
           or $\Sigma (X_{1i} + X_{2i} + \cdots + X_{Ki}) = \Sigma X_{1i} + \Sigma X_{2i} + \cdots + \Sigma X_{Ki}$

Rule 1.2   $\Sigma aX_i = a\Sigma X_i$ where a is a constant

Rule 1.3   $\Sigma (X_i + b) = \Sigma X_i + Nb$ where b is constant

If it is not obvious that the above equations are correct, write both
sides of each equation as series and note that the difference between the
two sides is a matter of the order in which the summation (arithmetic) is
performed. Note that the use of parentheses in Rule 1.3 means that b is
contained in the series N times. That is,

$$\sum_{i=1}^{N} (X_i + b) = (X_1 + b) + (X_2 + b) + \cdots + (X_N + b) = (X_1 + X_2 + \cdots + X_N) + Nb$$

On the basis of Rule 1.1, we can write

$$\sum_{i=1}^{N} (X_i + b) = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} b$$

The expression $\sum_{i=1}^{N} b$ means "sum the value of b, which occurs N times."
Therefore, $\sum_{i=1}^{N} b = Nb$.

Notice that if the expression had been $\sum_{i}^{N} X_i + b$, then b is an amount
to add to the sum, $\sum_{i}^{N} X_i$.

In many equations $\bar{X}$ will appear; for example, $\sum_{i}^{N} \bar{X} X_i$ or $\sum_{i}^{N} (X_i - \bar{X})$.
Since $\bar{X}$ is constant with regard to the summation, $\Sigma \bar{X} X_i = \bar{X} \Sigma X_i$. Thus,
$\Sigma (X_i - \bar{X}) = \Sigma X_i - \Sigma \bar{X} = \Sigma X_i - N\bar{X}$. By definition, $\bar{X} = \frac{\Sigma X_i}{N}$. Therefore,
$N\bar{X} = \Sigma X_i$ and $\Sigma (X_i - \bar{X}) = 0$.
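
Rules 1.1 through 1.3, and the result that deviations from the mean sum to
zero, can be checked numerically. The sketch below uses hypothetical data
and constants; each pair holds the left-hand and right-hand side of one
rule so that the two can be compared.

```python
import math

# Hypothetical data and constants
X = [2.0, 0.0, 5.0, 7.0]
Y = [4.0, 1.0, 3.0, 6.0]
Z = [1.0, 4.0, 6.0, 3.0]
a, b = 3.0, 2.5
N = len(X)

# Each tuple is (left-hand side, right-hand side)
rule_1_1 = (sum(x - y + z for x, y, z in zip(X, Y, Z)),
            sum(X) - sum(Y) + sum(Z))
rule_1_2 = (sum(a * x for x in X), a * sum(X))
rule_1_3 = (sum(x + b for x in X), sum(X) + N * b)

mean = sum(X) / N
sum_of_deviations = sum(x - mean for x in X)   # should be zero

for name, (lhs, rhs) in [("1.1", rule_1_1), ("1.2", rule_1_2), ("1.3", rule_1_3)]:
    print("Rule", name, lhs, rhs, math.isclose(lhs, rhs))
print("Sum of deviations:", sum_of_deviations)
```

A numerical check of this kind does not prove a rule, but it is a quick way
to catch a misplaced parenthesis before attempting the algebra.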
1 ^ 1 

To work with an expression like $\sum_{i}^{N} (X_i + b)^2$ we must square the quantity
in parentheses before summing. Thus,

$$\begin{aligned}
\Sigma (X_i + b)^2 &= \Sigma (X_i^2 + 2bX_i + b^2) \\
&= \Sigma X_i^2 + \Sigma 2bX_i + \Sigma b^2 && \text{Rule 1.1} \\
&= \Sigma X_i^2 + 2b\Sigma X_i + Nb^2 && \text{Rules 1.2 and 1.3}
\end{aligned}$$

Verify this result by using series notation. Start with $(X_1 + b)^2 + \cdots + (X_N + b)^2$.

It is very important that the ordinary rules of algebra pertaining to
the use of parentheses be observed. Students frequently make errors
because inadequate attention is given to the placement of parentheses or
to the interpretation of parentheses. Until you become familiar with the
above rules, practice translating shorthand to series and series to
shorthand. Study the following examples carefully:

(1) $\Sigma (X_i)^2 \neq (\Sigma X_i)^2$
    The left-hand side is the sum of the squares of $X_i$. The right-hand
    side is the square of the sum of $X_i$. On the right the parentheses are
    necessary. The left side could have been written $\Sigma X_i^2$.

(2) $\Sigma \frac{X_i}{N} = \frac{\Sigma X_i}{N}$
    Rule 1.2 applies.

(3) $\Sigma (X_i + Y_i)^2 \neq \Sigma X_i^2 + \Sigma Y_i^2$
    A quantity in parentheses must be squared before taking a sum.

(4) $\Sigma (X_i^2 + Y_i^2) = \Sigma X_i^2 + \Sigma Y_i^2$
    Rule 1.1 applies.

(5) $\Sigma X_i Y_i \neq (\Sigma X_i)(\Sigma Y_i)$
    The left side is the sum of products. The right side is the product of
    sums.

(6) $\Sigma (X_i - Y_i)^2 = \Sigma X_i^2 - 2\Sigma X_i Y_i + \Sigma Y_i^2$

(7) $\sum_{i}^{N} a(X_i - b) \neq a\sum_{i}^{N} X_i - ab$

(8) $\sum_{i}^{N} a(X_i - b) = a\sum_{i}^{N} X_i - Nab$

(9) $a[\sum_{i}^{N} X_i - b] = a\sum_{i}^{N} X_i - ab$

(10) $\Sigma X_i (X_i - Y_i) = \Sigma X_i^2 - \Sigma X_i Y_i$
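
The effect of where the parentheses fall can be made concrete with numbers.
The sketch below uses hypothetical data and constants and evaluates both
sides of examples (1), (5), (8), and (9), so the inequalities and equalities
can be seen directly.

```python
# Hypothetical data and constants
X = [2, 0, 5, 7]
Y = [2, 3, 1, 14]
N = len(X)
a, b = 3, 4

sum_of_squares = sum(x ** 2 for x in X)            # sum of the squares of X_i
square_of_sum = sum(X) ** 2                        # square of the sum of X_i

sum_of_products = sum(x * y for x, y in zip(X, Y))   # sum of products X_i Y_i
product_of_sums = sum(X) * sum(Y)                    # product of the two sums

a_inside_sum = sum(a * (x - b) for x in X)   # example (8): b is subtracted N times
a_outside_sum = a * (sum(X) - b)             # example (9): b is subtracted once

print(sum_of_squares, square_of_sum)
print(sum_of_products, product_of_sums)
print(a_inside_sum, a * sum(X) - N * a * b)
print(a_outside_sum, a * sum(X) - a * b)
```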

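Exercise 1.5. Prove the following:

In all cases, assume $i = 1, 2, \ldots, N$.

(1) $\Sigma (X_i - \bar{X}) = 0$

(2) [illegible]

(3) $N\bar{X}^2 = \frac{(\Sigma X_i)^2}{N}$

(4) $\sum_{i=1}^{N} (aX_i + bY_i + C) = a\Sigma X_i + b\Sigma Y_i + NC$

Note: Equations (5) and (6) should be (or become)
very familiar equations.

(5) $\Sigma (X_i - \bar{X})^2 = \Sigma X_i^2 - N\bar{X}^2$

(6) $\Sigma (X_i - \bar{X})(Y_i - \bar{Y}) = \Sigma X_i Y_i - N\bar{X}\bar{Y}$

(7) $\Sigma (\frac{X_i}{a} + Y_i)^2 = \frac{1}{a^2} \Sigma (X_i + aY_i)^2$

(8) Let $Y_i = a + bX_i$, show that $\bar{Y} = a + b\bar{X}$
    and $\Sigma Y_i^2 = Na(a + 2b\bar{X}) + b^2 \Sigma X_i^2$

(9) Assume that $X_i = 1$ for $N_1$ elements of a set and that $X_i = 0$
    for $N_0$ of the elements. The total number of elements in the
    set is $N = N_1 + N_0$. Let $\frac{N_1}{N} = P$ and $\frac{N_0}{N} = Q$. Prove that
    $\frac{\Sigma (X_i - \bar{X})^2}{N} = PQ$.

(10) $\Sigma (X_i - d)^2 = \Sigma (X_i - \bar{X})^2 + N(\bar{X} - d)^2$. Hint: Rewrite $(X_i - d)^2$
    as $[(X_i - \bar{X}) + (\bar{X} - d)]^2$. Recall from elementary algebra that
    $(a + b)^2 = a^2 + 2ab + b^2$ and think of $(X_i - \bar{X})$ as a and of $(\bar{X} - d)$
    as b. For what value of d is $\Sigma (X_i - d)^2$ a minimum?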
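
Before attempting the proof of identity (10), it can be reassuring to see
that both sides agree for a few trial values of d. The sketch below uses
hypothetical data and hypothetical function names; it checks the identity
numerically and leaves the question about the minimum to the reader.

```python
import math

X = [2.0, 0.0, 5.0, 7.0]        # hypothetical values of X
N = len(X)
mean = sum(X) / N               # X-bar

def left_side(d):
    """Sum of squared deviations of X_i from the constant d."""
    return sum((x - d) ** 2 for x in X)

def right_side(d):
    """Sum of squared deviations from the mean, plus N times (mean - d) squared."""
    return sum((x - mean) ** 2 for x in X) + N * (mean - d) ** 2

trial_values = [-1.0, 0.0, 2.0, 3.5, 10.0]
checks = [(d, left_side(d), right_side(d)) for d in trial_values]
for d, lhs, rhs in checks:
    print(d, lhs, rhs, math.isclose(lhs, rhs))
```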
1.5 DOUBLE INDEXES AND SUMMATION

When there is more than one characteristic for a set of elements,
the different characteristics might be distinguished by using a different
letter for each or by an index. For example, $X_i$ and $Y_i$ might represent
the number of acres of wheat planted and the number of acres of wheat
harvested on the $i^{th}$ farm. Or, $X_{ij}$ might be used where i is the index
for the characteristics and j is the index for elements; that is, $X_{ij}$
would be the value of characteristic $X_i$ for the $j^{th}$ element. However,
when data on each of several characteristics for a set of elements are
to be processed in the same way, it might not be necessary to use
notation that distinguishes the characteristics. Thus, one might say

calculate $\frac{\Sigma (X_i - \bar{X})^2}{N-1}$ for all characteristics.

More than one index is needed when the elements are classified
according to more than one criterion. For example, $X_{ij}$ might represent the value
of characteristic X for the $j^{th}$ farm in the $i^{th}$ county; or $X_{ijk}$ might be
the value of X for the $k^{th}$ household in the $j^{th}$ block in the $i^{th}$ city.
As another example, suppose the processing of data for farms involves
classification of farms by size and type. We might let $X_{ijk}$ represent
the value of characteristic X for the $k^{th}$ farm in the subset of farms
classified as type j and size i. If $N_{ij}$ is the number of farms classified
as type j and size i, then

$$\frac{\sum_{k}^{N_{ij}} X_{ijk}}{N_{ij}} = \bar{X}_{ij\cdot}$$

is the average value of X for the subset of farms classified as type j and size i.
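
A cell average of this kind can be computed by grouping records on the two
classification indexes. The following is a sketch only: the records, the
class codes, and the variable names are hypothetical. Each record carries a
size class i, a type class j, and a value of X; the third index k is implied
by the position of the record within its cell.

```python
from collections import defaultdict

# Hypothetical records: (size class i, type class j, value of X)
records = [
    (1, 1, 120.0), (1, 1, 80.0),
    (1, 2, 200.0),
    (2, 1, 310.0), (2, 1, 290.0), (2, 1, 300.0),
    (2, 2, 150.0),
]

# Collect the values of X that fall in each (i, j) cell
cells = defaultdict(list)
for i, j, x in records:
    cells[(i, j)].append(x)

# N_ij is the number of farms in the cell; the cell mean is the cell sum over N_ij
cell_counts = {key: len(values) for key, values in cells.items()}
cell_means = {key: sum(values) / len(values) for key, values in cells.items()}

for key in sorted(cell_means):
    print(key, cell_counts[key], cell_means[key])
```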

There are two general kinds of classification: cross classification
and hierarchal or nested classification. Both kinds are often involved
in the same problem. However, we will discuss each separately. An
example of nested classification is farms within counties, counties within
States, and States within regions. Cross classification means that the
data can be arranged in two or more dimensions as illustrated in the next
section.

1.5.1 CROSS CLASSIFICATION 

As a specific illustration of cross classification and summation with
two indexes, suppose we are working with the acreages of K crops on a set
of N farms. Let $X_{ij}$ represent the acreage of the $i^{th}$ crop on the $j^{th}$ farm
where $i = 1, 2, \ldots, K$ and $j = 1, 2, \ldots, N$. In this case, the data could
be arranged in a K by N matrix as follows:

$$
\begin{array}{c|ccccc|c}
\text{Row } (i) & & & \text{Column } (j) & & & \text{Row total} \\
 & 1 & \cdots & j & \cdots & N & \\
\hline
1 & X_{11} & \cdots & X_{1j} & \cdots & X_{1N} & \sum_{j} X_{1j} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
i & X_{i1} & \cdots & X_{ij} & \cdots & X_{iN} & \sum_{j} X_{ij} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
K & X_{K1} & \cdots & X_{Kj} & \cdots & X_{KN} & \sum_{j} X_{Kj} \\
\hline
\text{Column total} & \sum_{i} X_{i1} & \cdots & \sum_{i} X_{ij} & \cdots & \sum_{i} X_{iN} & \sum_{ij} X_{ij}
\end{array}
$$

The expression $\sum_{j} X_{ij}$ (or $\sum^{N} X_{ij}$) means the sum of the values of $X_{ij}$ for a
fixed value of i. Thus, with reference to the matrix, $\sum_{j} X_{ij}$ is the total
of the values of X in the $i^{th}$ row; or, with reference to the example about
farms and crop acreages, $\sum_{j} X_{ij}$ would be the total acreage on all farms of
whatever the $i^{th}$ crop is. Similarly, $\sum_{i} X_{ij}$ (or $\sum^{K} X_{ij}$) is the column total
for the $j^{th}$ column, which in the example is the total for the $j^{th}$ farm of
the acreages of the K crops under consideration. The sum of all values of
X could be written as $\sum_{ij}^{KN} X_{ij}$ or $\Sigma\Sigma X_{ij}$.
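
Row totals, column totals, and the grand total of a K by N matrix can be
written out directly from their definitions. The sketch below uses a
hypothetical matrix of acreages (rows are crops, columns are farms) and
hypothetical variable names; the index that is held fixed appears in the
outer loop and the index being summed appears inside `sum`.

```python
# Hypothetical K x N matrix: X[i][j] is the acreage of crop i+1 on farm j+1
X = [
    [10, 0, 25, 5],
    [30, 12, 0, 8],
    [0, 40, 15, 20],
]
K = len(X)        # number of rows (crops)
N = len(X[0])     # number of columns (farms)

# Sum over j for each fixed i: one total per row
row_totals = [sum(X[i][j] for j in range(N)) for i in range(K)]
# Sum over i for each fixed j: one total per column
column_totals = [sum(X[i][j] for i in range(K)) for j in range(N)]
# Sum over both indexes
grand_total = sum(X[i][j] for i in range(K) for j in range(N))

print(row_totals, column_totals, grand_total)
```

The grand total can be reached either by adding the row totals or by adding
the column totals, which is the idea behind writing a double sum as a sum of
partial sums.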

Double summation means the sum of sums. Breaking a double sum into
parts can be an important aid to understanding it. Here are two examples:

(1) $$\sum_{ij}^{KN} X_{ij} = \sum_{j}^{N} X_{1j} + \sum_{j}^{N} X_{2j} + \cdots + \sum_{j}^{N} X_{Kj} \qquad (1.1)$$

With reference to the above matrix, Equation (1.1) expresses the grand total
as the sum of row totals.

(2) $$\sum_{ij}^{KN} X_{ij}(Y_{ij} + a) = \sum_{j}^{N} X_{1j}(Y_{1j} + a) + \cdots + \sum_{j}^{N} X_{Kj}(Y_{Kj} + a) \qquad (1.2)$$

$$\sum_{j}^{N} X_{1j}(Y_{1j} + a) = X_{11}(Y_{11} + a) + \cdots + X_{1N}(Y_{1N} + a)$$

In Equations (1.1) and (1.2) the double sum is written as the sum of K
partial sums, that is, one partial sum for each value of i.

Exercise 1.6. (a) Write an equation similar to Equation (1.1) that
expresses the grand total as the sum of column totals. (b) Involved in
Equation (1.2) are KN terms, $X_{ij}(Y_{ij} + a)$. Write these terms in the form of
a matrix.






The rules given in Section 1.4 also apply to double summation.
Thus,

$$\sum_{ij}^{KN} X_{ij}(Y_{ij} + a) = \sum_{ij}^{KN} X_{ij}Y_{ij} + a\sum_{ij}^{KN} X_{ij} \qquad (1.3)$$

Study Equation (1.3) with reference to the matrix called for in Exercise
1.6(b). To fully understand Equation (1.3), you might need to write out
intermediate steps for getting from the left-hand side to the right-hand
side of the equation.
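
Equation (1.3) can be checked numerically on a small example. The sketch
below uses hypothetical 2 by 3 matrices and a hypothetical constant a; it
also forms the K partial sums of Equation (1.2), one for each row, and
confirms that they add up to the double sum.

```python
# Hypothetical K x N data (K = 2 rows, N = 3 columns) and constant a
X = [[1, 2, 3],
     [4, 5, 6]]
Y = [[2, 0, 1],
     [3, 1, 2]]
a = 10
K, N = len(X), len(X[0])

# Left-hand side of Equation (1.3): double sum of X_ij (Y_ij + a)
left_side = sum(X[i][j] * (Y[i][j] + a) for i in range(K) for j in range(N))

# Right-hand side: double sum of X_ij Y_ij, plus a times the double sum of X_ij
sum_xy = sum(X[i][j] * Y[i][j] for i in range(K) for j in range(N))
sum_x = sum(X[i][j] for i in range(K) for j in range(N))
right_side = sum_xy + a * sum_x

# Partial sums of Equation (1.2): one sum over j for each fixed i
partial_sums = [sum(X[i][j] * (Y[i][j] + a) for j in range(N)) for i in range(K)]

print(left_side, right_side, partial_sums)
```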

To simplify notation, a system of dot notation is commonly used, for
example:

$\sum_{j} X_{ij} = X_{i\cdot}$

$\sum_{i} X_{ij} = X_{\cdot j}$

$\sum_{ij} X_{ij} = X_{\cdot\cdot}$

The dot in $X_{i\cdot}$ indicates that an index in addition to i is involved and
$X_{i\cdot}$ is interpreted as the sum of the values of X for a fixed value of i.
Similarly, $X_{\cdot j}$ is the sum of X for any fixed value of j, and $X_{\cdot\cdot}$ represents
a sum over both indexes. As stated above, averages are indicated by use of
a bar. Thus $\bar{X}_{i\cdot}$ is the average of $X_{ij}$ for a fixed value of i, namely

$\bar{X}_{i\cdot} = \frac{\sum_{j}^{N} X_{ij}}{N}$

and $\bar{X}_{\cdot\cdot}$ would represent the average of all values of $X_{ij}$, namely

$\bar{X}_{\cdot\cdot} = \frac{\sum_{i}^{K}\sum_{j}^{N} X_{ij}}{KN}$

Here is an example of how the dot notation can simplify an algebraic
expression. Suppose one wishes to refer to the sum of the squares of the
row totals in the above matrix. This would be written as $\sum_{i}(X_{i\cdot})^2$. The sum
of squares of the row means would be $\sum_{i}(\bar{X}_{i\cdot})^2$. Without the dot notation the
corresponding expressions would be

$\sum_{i}^{K}\left(\sum_{j}^{N} X_{ij}\right)^2$ and $\sum_{i}^{K}\left(\frac{\sum_{j}^{N} X_{ij}}{N}\right)^2$

It is very important that the parentheses be used correctly. For example,
$\sum_{i}^{K}\left(\sum_{j}^{N} X_{ij}\right)^2$ is not the same as $\sum_{i}^{K}\sum_{j}^{N} X_{ij}^2$. Incidentally, what is the
difference between the last two expressions?

Using the dot notation, the variance of the row means could be written
as follows:

$V(\bar{X}_{i\cdot}) = \frac{\sum_{i}^{K}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2}{K-1}$    (1.4)

where V stands for variance and $V(\bar{X}_{i\cdot})$ is an expression for the variance of
$\bar{X}_{i\cdot}$. Without the dot notation, or something equivalent to it, a formula
for the variance of the row means would look much more complicated.
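
The dot notation maps directly onto row and column operations. The sketch below computes the row totals, column totals, grand total, row means, and the variance of the row means in Equation (1.4). The 3 x 3 matrix is a made-up example, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Hypothetical K x N matrix; the values are illustrative only.
X = [[2, 4, 6], [1, 3, 5], [4, 6, 8]]
K, N = len(X), len(X[0])

row_totals = [sum(row) for row in X]                              # X_i.
col_totals = [sum(X[i][j] for i in range(K)) for j in range(N)]   # X_.j
grand_total = sum(row_totals)                                     # X_..

row_means = [Fraction(t, N) for t in row_totals]                  # X-bar_i.
overall_mean = Fraction(grand_total, K * N)                       # X-bar_..

# Equation (1.4): variance of the row means, with divisor K - 1.
v_row_means = sum((m - overall_mean) ** 2 for m in row_means) / (K - 1)

print(row_totals, col_totals, grand_total)
print(v_row_means)
```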

Exercise 1.7. Write an equation, like Equation (1.4), for the variance 
of the column means. 



Exercise 1.8. Given the following values of $X_{ij}$:

         :          j
    i    :   1     2     3     4
   ------:------------------------
    1    :   8    11     9    14
    2    :  10    13    11    14
    3    :  12    15    10    17



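Find the value of the following algebraic expressions:

       Expression                                                                              Answer

 (1)   $\sum_{j}^{N} X_{1j}$                                                                      42
 (2)   $\frac{\sum_{j}^{N} X_{2j}}{N}$                                                            12
 (3)   $\bar{X}_{3\cdot}$                                                                         13.5
 (4)   $X_{\cdot 4}$                                                                              45
 (5)   $\sum_{i}^{K}\sum_{j}^{N} X_{ij}$                                                          144
 (6)   $\bar{X}_{\cdot\cdot}$                                                                     12
 (7)   $\sum_{i}^{K}\sum_{j}^{N}(X_{ij}-\bar{X}_{\cdot\cdot})^2$                                  78
 (8)   $N\sum_{i}^{K}(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2$                                   18
 (9)   $K\sum_{j}^{N}(\bar{X}_{\cdot j}-\bar{X}_{\cdot\cdot})^2$                                  54
(10)   $\sum_{i}^{K}\sum_{j}^{N}(X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}+\bar{X}_{\cdot\cdot})^2$
(11)   [expression illegible in source]                                                       78
(12)   [expression illegible in source]                                                       18
(13)   [expression illegible in source]                                                       21
(14)   [expression illegible in source]                                                       60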



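Illustration 1.1. To introduce another aspect of notation, refer to
the matrix on Page 15 and suppose that the values of X in row one are to
be multiplied by $a_1$, the values of X in row two by $a_2$, etc. The matrix
would then be

$$\begin{matrix}
a_1X_{11} & \dots & a_1X_{1j} & \dots & a_1X_{1N} \\
\vdots & & \vdots & & \vdots \\
a_iX_{i1} & \dots & a_iX_{ij} & \dots & a_iX_{iN} \\
\vdots & & \vdots & & \vdots \\
a_KX_{K1} & \dots & a_KX_{Kj} & \dots & a_KX_{KN}
\end{matrix}$$

The general term can be written as $a_iX_{ij}$ because the index of a and the
index i in $X_{ij}$ are the same. The total of all KN values of $a_iX_{ij}$ is
$\sum_{i}^{K}\sum_{j}^{N} a_iX_{ij}$. Since $a_i$ is constant with respect to summation involving j,
we can place $a_i$ ahead of the summation symbol $\sum_{j}^{N}$. That is,

$\sum_{i}^{K}\sum_{j}^{N} a_iX_{ij} = \sum_{i}^{K} a_i \sum_{j}^{N} X_{ij} = \sum_{i}^{K} a_iX_{i\cdot}$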

i 



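Exercise 1.9. Refer to the matrix of values of $X_{ij}$ in Exercise 1.8.
Assume that $a_1 = -1$, $a_2 = 0$, and $a_3 = 1$.

Calculate:

(1) $\sum_{ij} a_iX_{ij}$

(2) $\sum_{ij} \frac{a_iX_{ij}}{N}$

(3) $\sum_{ij} a_iX_{ij}^2$        Answer: 296

Show algebraically that:

(4) $\sum_{ij} a_iX_{ij} = \sum_{j} X_{3j} - \sum_{j} X_{1j}$

(5) $\sum_{ij} \frac{a_iX_{ij}}{N} = \bar{X}_{3\cdot} - \bar{X}_{1\cdot}$

(6) $\sum_{ij} a_iX_{ij}^2 = \sum_{j} X_{3j}^2 - \sum_{j} X_{1j}^2$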

Exercise 1.10. Study the following equation and if necessary write
the summations as series to be satisfied that the equation is correct:

$\sum_{i}^{K}\sum_{j}^{N}(aX_{ij}+bY_{ij}) = a\sum_{ij} X_{ij} + b\sum_{ij} Y_{ij}$

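Illustration 1.2. Suppose

$Y_{ij} = X_{ij} + a_i + b_j + c$ where i = 1, 2,...,K and j = 1, 2,...,N

The values of $Y_{ij}$ can be arranged in matrix format as follows:

$$\begin{matrix}
Y_{11} = X_{11}+a_1+b_1+c & \dots & Y_{1N} = X_{1N}+a_1+b_N+c \\
\vdots & Y_{ij} = X_{ij}+a_i+b_j+c & \vdots \\
Y_{K1} = X_{K1}+a_K+b_1+c & \dots & Y_{KN} = X_{KN}+a_K+b_N+c
\end{matrix}$$

Notice that $a_i$ is a quantity that varies from row to row but is constant
within a row and that $b_j$ varies from column to column but is constant
within a column. Applying the rules regarding the summation symbols we
have

$\sum_{j} Y_{ij} = \sum_{j}(X_{ij}+a_i+b_j+c) = \sum_{j} X_{ij} + Na_i + \sum_{j} b_j + Nc$

$\sum_{i} Y_{ij} = \sum_{i}(X_{ij}+a_i+b_j+c) = \sum_{i} X_{ij} + \sum_{i} a_i + Kb_j + Kc$

$\sum_{ij} Y_{ij} = \sum_{ij}(X_{ij}+a_i+b_j+c) = \sum_{ij} X_{ij} + N\sum_{i} a_i + K\sum_{j} b_j + KNc$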

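Illustration 1.3. We have noted that $\sum(X_iY_i)$ does not equal
$(\sum X_i)(\sum Y_i)$. (See (1) and (2) in Exercise 1.3, and (5) on Page 12). But,
$\sum_{i}\sum_{j} X_iY_j = (\sum_{i} X_i)(\sum_{j} Y_j)$ where i = 1, 2,...,K and j = 1, 2,...,N. This becomes
clear if we write the terms of $\sum_{i}\sum_{j} X_iY_j$ in matrix format as follows:

$$\begin{matrix}
 & & & & & \text{Row Totals} \\
X_1Y_1 & X_1Y_2 & \dots & X_1Y_N & & X_1\sum_{j} Y_j \\
X_2Y_1 & X_2Y_2 & \dots & X_2Y_N & & X_2\sum_{j} Y_j \\
\vdots & \vdots & & \vdots & & \vdots \\
X_KY_1 & X_KY_2 & \dots & X_KY_N & & X_K\sum_{j} Y_j
\end{matrix}$$

The sum of the terms in each row is shown at the right. The sum of these
row totals is $X_1\sum Y_j + \dots + X_K\sum Y_j = (X_1+\dots+X_K)\sum Y_j = \sum X_i \sum Y_j$. One could
get the same final result by adding the columns first. Very often
intermediate summations are of primary interest.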
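
The contrast between a single shared index and two separate indexes can be checked numerically. This is a minimal sketch with made-up values of X and Y; it is not the data of Exercise 1.3.

```python
# Hypothetical values; X and Y have the same length only so that the
# same-index sum can be shown alongside the double sum.
X = [1, 2, 3]
Y = [4, 5, 6]

# Same index on X and Y: sum of X_i * Y_i.
same_index = sum(x * y for x, y in zip(X, Y))

# Separate indexes: every X_i is paired with every Y_j.
double_sum = sum(x * y for x in X for y in Y)

# Row totals of the matrix of products, X_i * (sum of Y_j).
row_totals = [x * sum(Y) for x in X]
product_of_sums = sum(X) * sum(Y)

print(same_index, double_sum, product_of_sums, row_totals)
```

The double sum equals the product of the two sums, while the same-index sum generally does not.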

Exercise 1.11. Verify that $\sum_{i}\sum_{j} X_iY_j = (\sum_{i} X_i)(\sum_{j} Y_j)$ using the values of
X and Y in Exercise 1.3. In Exercise 1.3 the subscript of X and the
subscript of Y were the same index. In the expression $\sum_{i}\sum_{j} X_iY_j$ that is no longer
the case.

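Exercise 1.12. Prove the following:

(1) $\sum_{i}^{K}\sum_{j}^{N}(a_iX_{ij}+b_j)^2 = \sum_{i}^{K} a_i^2 \sum_{j}^{N} X_{ij}^2 + 2\sum_{i}^{K} a_i \sum_{j}^{N} b_jX_{ij} + K\sum_{j}^{N} b_j^2$

(2) $\sum_{i}^{K}\sum_{j}^{N} a_i(X_{ij}-\bar{X}_{i\cdot})^2 = \sum_{i}^{K} a_i \sum_{j}^{N} X_{ij}^2 - N\sum_{i}^{K} a_i\bar{X}_{i\cdot}^2$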

1.5.2 HIERARCHAL OR NESTED CLASSIFICATION

A double index does not necessarily imply that a meaningful cross
classification of the data can be made. For example, $X_{ij}$ might represent
the value of X for the j-th farm in the i-th county. In this case, j simply
identifies a farm within a county. There is no correspondence, for example,
between farm number 5 in one county and farm number 5 in another. In fact
the total number of farms varies from county to county. Suppose there are
K counties and $N_i$ farms in the i-th county. The total of X for the i-th
county could be expressed as $X_{i\cdot} = \sum_{j}^{N_i} X_{ij}$. In the present case $\sum_{i}^{K} X_{ij}$ is
meaningless. The total of all values of X is $\sum_{i}^{K}\sum_{j}^{N_i} X_{ij}$.

When the classification is nested, the order of the subscripts
(indexes) and the order of the summation symbols from left to right should
be from the highest to lowest order of classification. Thus in the above
example the index for farms was on the right and the summation symbol
involving this index is also on the right. In the expression $\sum_{i}^{K}\sum_{j}^{N_i} X_{ij}$,
summation with respect to i cannot take place before summation with regard
to j. On the other hand, when the classification is cross classification
the summations can be performed in either order.
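
Nested data of this kind can be pictured as a list of lists whose inner lengths differ. The sketch below uses made-up county data to show that the inner sum over farms must be taken within each county before the outer sum over counties.

```python
# Hypothetical nested data: one inner list of farm values per county.
# The inner lists have different lengths (N_i varies by county).
counties = [[3, 1, 5], [4, 2], [6, 5, 1, 0]]

K = len(counties)
N_i = [len(farms) for farms in counties]            # farms per county
county_totals = [sum(farms) for farms in counties]  # X_i. for each county
grand_total = sum(county_totals)                    # sum over i of X_i.

print(K, N_i, county_totals, grand_total)
```

There is no column total here: position j in one county does not correspond to position j in another, and some counties have no farm at a given position.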

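In the example of K counties and $N_i$ farms in the i-th county, and in
similar examples, you may think of the data as being arranged in rows (or
columns):

$X_{11}, X_{12}, \dots, X_{1N_1}$
$X_{21}, X_{22}, \dots, X_{2N_2}$
$\vdots$
$X_{K1}, X_{K2}, \dots, X_{KN_K}$

Here are two double sums taken apart for inspection:

(1) $\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = \sum_{j}^{N_1}(X_{1j}-\bar{X}_{\cdot\cdot})^2 + \dots + \sum_{j}^{N_K}(X_{Kj}-\bar{X}_{\cdot\cdot})^2$    (1.5)

    $\sum_{j}^{N_i}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = (X_{i1}-\bar{X}_{\cdot\cdot})^2 + \dots + (X_{iN_i}-\bar{X}_{\cdot\cdot})^2$

Equation (1.5) is the sum of squares of the deviations, $(X_{ij}-\bar{X}_{\cdot\cdot})$, of all
values of $X_{ij}$ from the overall mean. There are $\sum_{i}^{K} N_i$ values of $X_{ij}$, and

$\bar{X}_{\cdot\cdot} = \frac{\sum_{i}^{K}\sum_{j}^{N_i} X_{ij}}{\sum_{i}^{K} N_i}$

If there was no interest in identifying the data by counties,
a single index would be sufficient. Equation (1.5) would then be $\sum_{i}(X_i-\bar{X})^2$.

(2) $\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{i\cdot})^2 = \sum_{j}^{N_1}(X_{1j}-\bar{X}_{1\cdot})^2 + \dots + \sum_{j}^{N_K}(X_{Kj}-\bar{X}_{K\cdot})^2$    (1.6)

With reference to Equation (1.6) do you recognize $\sum_{j}^{N_1}(X_{1j}-\bar{X}_{1\cdot})^2$? It involves
only the subset of elements for which i = 1, namely $X_{11}, X_{12}, \dots, X_{1N_1}$. Note
that $\bar{X}_{1\cdot}$ is the average value of X in this subset. Hence, $\sum_{j}^{N_1}(X_{1j}-\bar{X}_{1\cdot})^2$ is
the sum of the squares of the deviations of the X's in this subset from the
subset mean. The double sum is the sum of K terms and each of the K terms
is a sum of squares for a subset of X's, the index for the subsets being i.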

Exercise 1.13. Let $X_{ij}$ represent the value of X for the j-th farm in
the i-th county. Also, let K be the number of counties and $N_i$ be the number
of farms in the i-th county. Suppose the values of X are as follows:




"l2- 1 


X 5 
*13 ^ 


"21 " * 






"31 " " 


"32-5 




Find the value of the following expressions: 


Expression 




Answer 


K 

(I) ZN 
1 * 




9 






Expression (Continued) 



Answer 



(2) 




27 




(3) 


X and X 


77 
^ 1 




(4) 




9 




(Si 


V AnH X 




B 


(6) 




3 


5 


(7) 








EN, 


3 





^ ^1 2 2 
(8) or Sx: 



2A5 



(9) IZ(X. -X..)'' 
ij 

(10) E^(X, -X, y 



36 



8 



''i - 2 
(U) E^(X,,-X, )^ 

J ^' 

KN 

(12) sr(x, -X, r 

ij ^* 



8. 2. and U for i - 1. 2, 
and 3 respectively 



24 



(13) 2:n,(x, -X )' 



(14) E 



1 ^1 



EN. 



12 



12 






(15) EN.X? -NX^. 
^ 1 I* • 



12 



26 



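Expressions (14) and (15) in Exercise 1.13 are symbolic representations
of the same thing. By definition

$\sum_{j}^{N_i} X_{ij} = X_{i\cdot}$, $\sum_{i}^{K}\sum_{j}^{N_i} X_{ij} = X_{\cdot\cdot}$, and $\sum_{i}^{K} N_i = N$

Substitution in (14) gives

$\sum_{i}^{K} \frac{X_{i\cdot}^2}{N_i} - \frac{X_{\cdot\cdot}^2}{N}$    (1.7)

Also by definition $\frac{X_{i\cdot}}{N_i} = \bar{X}_{i\cdot}$ and $\frac{X_{\cdot\cdot}}{N} = \bar{X}_{\cdot\cdot}$. Therefore $\frac{X_{i\cdot}^2}{N_i} = N_i\bar{X}_{i\cdot}^2$ and
$\frac{X_{\cdot\cdot}^2}{N} = N\bar{X}_{\cdot\cdot}^2$. Hence, by substitution, Equation (1.7) becomes $\sum_{i}^{K} N_i\bar{X}_{i\cdot}^2 - N\bar{X}_{\cdot\cdot}^2$.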

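Exercise 1.14. Prove the following:

(1) $\sum_{i}^{K}\sum_{j}^{N_i} X_{i\cdot}X_{ij} = \sum_{i}^{K} X_{i\cdot}^2$

(2) $\sum_{i}^{K}\sum_{j}^{N_i} \bar{X}_{i\cdot}(X_{ij}-\bar{X}_{i\cdot}) = 0$

(3) $\sum_{i}^{K} N_i(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2 = \sum_{i}^{K} N_i\bar{X}_{i\cdot}^2 - N\bar{X}_{\cdot\cdot}^2$

Note that this equates (13) and (15) in Exercise 1.13.
The proof is similar to the proof called for in part (5)
of Exercise 1.5.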

1.6 THE SQUARE OF A SUM 

In statistics, it is often necessary to work algebraically with the
square of a sum. For example,

$(\sum X_i)^2 = (X_1+X_2+\dots+X_N)^2 = X_1^2+X_1X_2+\dots+X_2^2+X_2X_1+\dots+X_N^2+X_NX_1+\dots$

The terms in the square of the sum can be written in matrix form as
follows:

$$\begin{matrix}
X_1X_1 & X_1X_2 & \dots & X_1X_j & \dots & X_1X_N \\
X_2X_1 & X_2X_2 & \dots & X_2X_j & \dots & X_2X_N \\
\vdots & \vdots & & \vdots & & \vdots \\
X_iX_1 & X_iX_2 & \dots & X_iX_j & \dots & X_iX_N \\
\vdots & \vdots & & \vdots & & \vdots \\
X_NX_1 & X_NX_2 & \dots & X_NX_j & \dots & X_NX_N
\end{matrix}$$

The general term in this matrix is $X_iX_j$ where $X_i$ and $X_j$ come from the same
set of X's, namely, $X_1,\dots,X_N$. Hence, i and j are indexes of the same set.
Note that the terms along the main diagonal are the squares of the values
of X and could be written as $\sum X_i^2$. That is, on the main diagonal i = j
and $X_iX_j = X_iX_i = X_i^2$. The remaining terms are all products of one value
of X with some other value of X. For these terms the indexes are never
equal. Therefore, the sum of all terms not on the main diagonal can be
expressed as $\sum_{i \neq j} X_iX_j$ where $i \neq j$ is used to express the fact that the
summation includes all terms where i is not equal to j, that is, all terms other
than those on the main diagonal. Hence, we have shown that

$(\sum X_i)^2 = \sum_{i} X_i^2 + \sum_{i \neq j} X_iX_j$

Notice the symmetry of terms above and below the main diagonal:
$X_1X_2 = X_2X_1$, $X_1X_3 = X_3X_1$, etc. When symmetry like this occurs, instead of
$\sum_{i \neq j} X_iX_j$ you might see an equivalent expression $2\sum_{i<j} X_iX_j$. The sum of all
terms above the main diagonal is $\sum_{i<j} X_iX_j$. Owing to the symmetry, the sum
of the terms below the main diagonal is the same. Therefore, $\sum_{i \neq j} X_iX_j =
2\sum_{i<j} X_iX_j$.
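
The split of the squared sum into diagonal and off-diagonal terms can be verified by enumerating the matrix of products. A minimal sketch, using four made-up values of X:

```python
# Hypothetical values of X; chosen only for illustration.
x = [3, 1, 4, 2]
n = len(x)

square_of_sum = sum(x) ** 2

# Main diagonal of the matrix of products: the squares.
sum_of_squares = sum(v * v for v in x)

# All terms off the main diagonal (i != j).
off_diagonal = sum(x[i] * x[j] for i in range(n) for j in range(n) if i != j)

# Terms strictly above the main diagonal (i < j).
above_diagonal = sum(x[i] * x[j] for i in range(n) for j in range(i + 1, n))

print(square_of_sum, sum_of_squares, off_diagonal, above_diagonal)
```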

A 

Exercise 1.15. Express the terms of $[\sum_{i=1}^{4} X_i]^2 = [X_1+X_2+X_3+X_4]^2$ in
matrix format. Let $X_1 = 2$, $X_2 = 0$, $X_3 = 5$, and $X_4 = 7$. Compute the values
of $\sum_{i} X_i^2$, $2\sum_{i<j} X_iX_j$, and $[\sum X_i]^2$. Show that $[\sum X_i]^2 = \sum_{i} X_i^2 + 2\sum_{i<j} X_iX_j$.

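An important result, which we will use in Chapter 3, follows from the
fact that

$[\sum X_i]^2 = \sum_{i} X_i^2 + \sum_{i \neq j} X_iX_j$    (1.8)

Let $X_i = Y_i-\bar{Y}$. Substituting $(Y_i-\bar{Y})$ for $X_i$ in Equation (1.8) we have

$[\sum(Y_i-\bar{Y})]^2 = \sum(Y_i-\bar{Y})^2 + \sum_{i \neq j}(Y_i-\bar{Y})(Y_j-\bar{Y})$

We know that $[\sum(Y_i-\bar{Y})]^2 = 0$ because $\sum(Y_i-\bar{Y}) = 0$. Therefore,

$\sum(Y_i-\bar{Y})^2 + \sum_{i \neq j}(Y_i-\bar{Y})(Y_j-\bar{Y}) = 0$

It follows that

$\sum_{i \neq j}(Y_i-\bar{Y})(Y_j-\bar{Y}) = -\sum_{i}(Y_i-\bar{Y})^2$    (1.9)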
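
Equation (1.9) is easy to confirm numerically. The sketch below uses four made-up values of Y and exact fractions, so the deviations sum to exactly zero.

```python
from fractions import Fraction

# Hypothetical values of Y; chosen only for illustration.
y = [2, 5, 7, 10]
n = len(y)
y_bar = Fraction(sum(y), n)
d = [v - y_bar for v in y]          # deviations from the mean

sum_sq_dev = sum(v * v for v in d)  # sum of squared deviations
cross = sum(d[i] * d[j] for i in range(n) for j in range(n) if i != j)

print(sum(d), sum_sq_dev, cross)
```

The sum of cross products of deviations is the negative of the sum of squared deviations, as Equation (1.9) states.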

Exercise 1.16. Consider

$\sum_{i \neq j}(Y_i-\bar{Y})(Y_j-\bar{Y}) = \sum_{i \neq j}(Y_iY_j - \bar{Y}Y_j - \bar{Y}Y_i + \bar{Y}^2)$

$= \sum_{i \neq j} Y_iY_j - \bar{Y}\sum_{i \neq j} Y_j - \bar{Y}\sum_{i \neq j} Y_i + \sum_{i \neq j} \bar{Y}^2$

Do you agree that $\sum_{i \neq j} \bar{Y}^2 = N(N-1)\bar{Y}^2$? With reference to the matrix layout,
$\bar{Y}^2$ appears $N^2$ times but the specification is $i \neq j$ so we do not want to
count the N times that $\bar{Y}^2$ is on the main diagonal. Try finding the values
of $\sum_{i \neq j} Y_i$ and $\sum_{i \neq j} Y_j$ and then show that

$\sum_{i \neq j}(Y_i-\bar{Y})(Y_j-\bar{Y}) = \sum_{i \neq j} Y_iY_j - N(N-1)\bar{Y}^2$

Hint: Refer to a matrix layout. In $\sum_{i \neq j} Y_i$ how many times does $Y_1$ appear?
Does $Y_2$ appear the same number of times?
1.7 SUMS OF SQUARES 

For various reasons statisticians are interested in components of
variation, that is, measuring the amount of variation attributable to each
of more than one source. This involves computing sums of squares that
correspond to the different sources of variation that are of interest.
We will discuss a simple example of nested classification and a simple
example of cross classification.

1.7.1 NESTED CLASSIFICATION

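To be somewhat specific, reference is made to the example of K counties
and $N_i$ farms in the i-th county. The sum of the squares of the deviations
of $X_{ij}$ from $\bar{X}_{\cdot\cdot}$ can be divided into two parts as shown by the following
formula:

$\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = \sum_{i}^{K} N_i(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2 + \sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{i\cdot})^2$    (1.10)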

The quantity on the left-hand side of Equation (1.10) is called the
total sum of squares. In Exercise 1.13, Part (9), the total sum of squares
was 36.

The first quantity on the right-hand side of the equation involves the
squares of $(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})$, which are deviations of the class means from the
overall mean. It is called the between class sum of squares or with reference
to the example the between county sum of squares. In Exercise 1.13,
Part (13), the between county sum of squares was computed. The answer was
12.

The last term is called the within sum of squares because it involves
deviations within the classes from the class means. It was presented
previously. See Equation (1.6) and the discussion pertaining to it. In
Exercise 1.13, the within class sum of squares was 24, which was calculated
in Part (12). Thus, from Exercise 1.13, we have the total sum of squares,
36, which equals the between, 12, plus the within, 24. This verifies
Equation (1.10).
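
The decomposition in Equation (1.10) can be checked on any nested data set. The sketch below uses made-up county data (not the values of Exercise 1.13) and exact fractions, so the three sums of squares add up without rounding error.

```python
from fractions import Fraction

# Hypothetical nested data: farm values grouped by county.
counties = [[2, 4, 6], [1, 3], [5, 7, 9, 8]]

N = sum(len(farms) for farms in counties)
overall_mean = Fraction(sum(sum(farms) for farms in counties), N)
county_means = [Fraction(sum(farms), len(farms)) for farms in counties]

# Total: deviations of every value from the overall mean.
total_ss = sum((x - overall_mean) ** 2 for farms in counties for x in farms)

# Between: deviations of county means from the overall mean, weighted by N_i.
between_ss = sum(len(farms) * (m - overall_mean) ** 2
                 for farms, m in zip(counties, county_means))

# Within: deviations of each value from its own county mean.
within_ss = sum((x - m) ** 2
                for farms, m in zip(counties, county_means) for x in farms)

print(total_ss, between_ss, within_ss)
```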

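The proof of Equation (1.10) is easy if one gets started correctly.
Write $X_{ij}-\bar{X}_{\cdot\cdot} = (X_{ij}-\bar{X}_{i\cdot}) + (\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})$. This simple technique of adding and
subtracting $\bar{X}_{i\cdot}$ divides the deviation $(X_{ij}-\bar{X}_{\cdot\cdot})$ into two parts. The proof
proceeds as follows:

$\sum_{ij}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = \sum_{ij}[(X_{ij}-\bar{X}_{i\cdot}) + (\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})]^2$

$= \sum_{ij}[(X_{ij}-\bar{X}_{i\cdot})^2 + 2(X_{ij}-\bar{X}_{i\cdot})(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot}) + (\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2]$

$= \sum_{ij}(X_{ij}-\bar{X}_{i\cdot})^2 + 2\sum_{ij}(X_{ij}-\bar{X}_{i\cdot})(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot}) + \sum_{ij}(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2$

Exercise 1.17. Show that $\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{i\cdot})(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot}) = 0$

and that $\sum_{i}^{K}\sum_{j}^{N_i}(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2 = \sum_{i}^{K} N_i(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2$

Completion of Exercise 1.17 completes the proof.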

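Equation (1.10) is written in a form which displays its meaning rather
than in a form that is most useful for computational purposes. For
computation purposes, the following relationships are commonly used:

Total = $\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = \sum_{ij} X_{ij}^2 - N\bar{X}_{\cdot\cdot}^2$

Between = $\sum_{i}^{K} N_i(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2 = \sum_{i} N_i\bar{X}_{i\cdot}^2 - N\bar{X}_{\cdot\cdot}^2$

Within = $\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{i\cdot})^2 = \sum_{ij} X_{ij}^2 - \sum_{i} N_i\bar{X}_{i\cdot}^2$

where $N = \sum_{i}^{K} N_i$, $\bar{X}_{i\cdot} = \frac{\sum_{j}^{N_i} X_{ij}}{N_i}$, and $\bar{X}_{\cdot\cdot} = \frac{\sum_{i}^{K}\sum_{j}^{N_i} X_{ij}}{N}$

Notice that the major part of arithmetic reduces to calculating $\sum_{ij} X_{ij}^2$,
$\sum_{i} N_i\bar{X}_{i\cdot}^2$, and $N\bar{X}_{\cdot\cdot}^2$. There are variations of this that one might use. For
example, one could use $\sum_{i}^{K} \frac{X_{i\cdot}^2}{N_i}$ instead of $\sum_{i}^{K} N_i\bar{X}_{i\cdot}^2$.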

Exercise 1.18. Show that

$\sum_{i}^{K}\sum_{j}^{N_i}(X_{ij}-\bar{X}_{i\cdot})^2 = \sum_{ij} X_{ij}^2 - \sum_{i} N_i\bar{X}_{i\cdot}^2$

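A special case that is useful occurs when $N_i = 2$. The within sum of
squares becomes

$\sum_{i}^{K}\sum_{j}^{2}(X_{ij}-\bar{X}_{i\cdot})^2 = \sum_{i}^{K}[(X_{i1}-\bar{X}_{i\cdot})^2 + (X_{i2}-\bar{X}_{i\cdot})^2]$

Since $\bar{X}_{i\cdot} = \frac{X_{i1}+X_{i2}}{2}$ it is easy to show that

$(X_{i1}-\bar{X}_{i\cdot})^2 = \frac{1}{4}(X_{i1}-X_{i2})^2$ and $(X_{i2}-\bar{X}_{i\cdot})^2 = \frac{1}{4}(X_{i1}-X_{i2})^2$

Therefore the within sum of squares is

$\frac{1}{2}\sum_{i}^{K}(X_{i1}-X_{i2})^2$

which is a convenient form for computation.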
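
For classes of size two, the shortcut can be compared with the direct calculation. A minimal sketch with made-up pairs of values:

```python
from fractions import Fraction

# Hypothetical data: each class contains exactly two values.
pairs = [(3, 7), (10, 4), (5, 5)]

# Direct within sum of squares: deviations from each pair's own mean.
direct = 0
for x1, x2 in pairs:
    mean = Fraction(x1 + x2, 2)
    direct += (x1 - mean) ** 2 + (x2 - mean) ** 2

# Shortcut: one half of the sum of squared differences within pairs.
shortcut = Fraction(1, 2) * sum((x1 - x2) ** 2 for x1, x2 in pairs)

print(direct, shortcut)
```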

1.7.2 CROSS CLASSIFICATION

Reference is made to the matrix on Page 15 and to Exercise 1.8. The
total sum of squares can be divided into three parts as shown by the
following formula:

$\sum_{i}^{K}\sum_{j}^{N}(X_{ij}-\bar{X}_{\cdot\cdot})^2 = N\sum_{i}^{K}(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})^2 + K\sum_{j}^{N}(\bar{X}_{\cdot j}-\bar{X}_{\cdot\cdot})^2 + \sum_{i}^{K}\sum_{j}^{N}(X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}+\bar{X}_{\cdot\cdot})^2$    (1.11)

Turn to Exercise 1.8 and find the total sum of squares and the three
parts. They are:

                Sum of Squares

    Total             78
    Rows              18
    Columns           54
    Remainder          6

The three parts add to the total which verifies Equation (1.11). In
Exercise 1.8, the sum of squares called remainder was computed directly
(see Part (10) of Exercise 1.8). In practice, the remainder sum of squares
is usually obtained by subtracting the row and column sum of squares from
the total.
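
The four sums of squares quoted above can be reproduced from the 3 x 4 matrix of Exercise 1.8. The sketch below is one way to do the arithmetic; exact fractions are used because some of the row means are not whole numbers.

```python
from fractions import Fraction

# Values of X from Exercise 1.8 (K = 3 rows, N = 4 columns).
X = [[8, 11, 9, 14], [10, 13, 11, 14], [12, 15, 10, 17]]
K, N = len(X), len(X[0])

mean = Fraction(sum(map(sum, X)), K * N)
row_means = [Fraction(sum(row), N) for row in X]
col_means = [Fraction(sum(X[i][j] for i in range(K)), K) for j in range(N)]

total_ss = sum((X[i][j] - mean) ** 2 for i in range(K) for j in range(N))
rows_ss = N * sum((m - mean) ** 2 for m in row_means)
cols_ss = K * sum((m - mean) ** 2 for m in col_means)

# Remainder computed directly from the last term of Equation (1.11).
remainder_ss = sum((X[i][j] - row_means[i] - col_means[j] + mean) ** 2
                   for i in range(K) for j in range(N))

print(total_ss, rows_ss, cols_ss, remainder_ss)
```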

Again, the proof of Equation (1.11) is not difficult if one makes the
right start. In this case the deviation, $(X_{ij}-\bar{X}_{\cdot\cdot})$, is divided into three
parts by adding and subtracting $\bar{X}_{i\cdot}$ and $\bar{X}_{\cdot j}$ as follows:

$(X_{ij}-\bar{X}_{\cdot\cdot}) = (\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot}) + (\bar{X}_{\cdot j}-\bar{X}_{\cdot\cdot}) + (X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}+\bar{X}_{\cdot\cdot})$    (1.12)

Exercise 1.19. Prove Equation (1.11) by squaring both sides of
Equation (1.12) and then doing the summation. The proof is mostly a matter of
showing that the sums of the terms which are products (not squares) are zero.
For example, showing that $\sum_{i}^{K}\sum_{j}^{N}(\bar{X}_{i\cdot}-\bar{X}_{\cdot\cdot})(X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}+\bar{X}_{\cdot\cdot}) = 0$.

CHAPTER II. RANDOM VARIABLES AND PROBABILITY 

2.1 RANDOM VARIABLES 

The word "random" has a wide variety of meanings. Its use in such
terms as "random events," "random variable," or "random sample," however,
implies a random process such that the probability of an event occurring
is known a priori. To select a random sample of elements from a population,
tables of random numbers are used. There are various ways of using such
tables to make a random selection so any given element will have a specified
probability of being selected.

The theory of probability sampling is founded on the concept of a
random variable which is a variable that, by chance, might equal any one
of a defined set of values. The value of a random variable on any
particular occasion is determined by a random process in such a way that the
chance (probability) of its being equal to any specified value in the set
is known. This is in accord with the definition of a probability sample
which states that every element of the population must have a known
probability (greater than zero) of being selected. A primary purpose of this
chapter is to present an elementary, minimum introduction or review of
probability as background for the next chapter on expected values of a
random variable. This leads to a theoretical basis for sampling and for
evaluating the accuracy of estimates from a probability-sample survey.

In sampling theory, we usually start with an assumed population of N
elements and a measurement for each element of some characteristic X. A
typical mathematical representation of the N measurements or values is
$X_1,\dots,X_i,\dots,X_N$ where $X_i$ is the value of the characteristic X for the i-th
element. Associated with the i-th element is a probability $P_i$, which is the
probability of obtaining it when one element is selected at random from the
set of N. The $P_i$'s will be called selection probabilities. If each
element has an equal chance of selection, $P_i = \frac{1}{N}$. The $P_i$'s need not be
equal, but we will specify that each $P_i > 0$. When referring to the probability
of X being equal to $X_i$ we will use $P(X_i)$ instead of $P_i$.

We need to be aware of a distinction between selection probability
and inclusion probability, the latter being the probability of an element
being included in a sample. In this chapter, much of the discussion is
oriented toward selection probabilities because of its relevance to finding
expected values of estimates from samples of various kinds.

Definition 2.1. A random variable is a variable that can equal any
value $X_i$, in a defined set, with a probability $P(X_i)$.

When an element is selected at random from a population and a
measurement of a characteristic of it is made, the value obtained is a random
variable. As we shall see later, if a sample of elements is selected at
random from a population, the sample average and other quantities calculated
from the sample are random variables.

Illustration 2.1. One of the most familiar examples of a random
variable is the number of dots that happen to be on the top side of a die
when it comes to rest after a toss. This also illustrates the concept of
probability that we are interested in; namely, the relative frequency with
which a particular outcome will occur in reference to a defined set of
possible outcomes. With a die there are six possible outcomes and we expect
each to occur with the same frequency, 1/6, assuming the die is tossed a
very large or infinite number of times. Implicit in a statement that each
side of a die has a probability of 1/6 of being the top side are some
assumptions about the physical structure of the die and the "randomness"
of the toss.

The additive and multiplicative laws of probability can be stated in 
several ways depending upon the context in which they are to be used. In 
sampling, our interest is primarily in the outcome of one random selection 
or of a series of random selections that yields a probability sample. 
Hence, the rules or theorems for the addition or multiplication of prob- 
abilities will be stated or discussed only in the context of probability 
sampling. 

2.2 ADDITION OF PROBABILITIES

Assume a population of N elements and a variable X which has a value
$X_i$ for the i-th element. That is, we have a set of values of X, namely
$X_1,\dots,X_i,\dots,X_N$. Let $P_1,\dots,P_i,\dots,P_N$ be a set of selection probabilities
where $P_i$ is the probability of selecting the i-th element when a random
selection is made. We specify that each $P_i$ must be greater than zero and
that $\sum_{i}^{N} P_i = 1$. When an element is selected at random, the probability that
it is either the i-th element or the j-th element is $P_i + P_j$. This addition
rule can be stated more generally. Let $P_s$ be the sum of the selection
probabilities for the elements in a subset of the N elements. When a random
selection is made from the whole set, $P_s$ is the probability that the element
selected is from the subset and $1-P_s$ is the probability that it is not from
the subset. With reference to the variable X, let $P(X_i)$ represent the
probability that X equals $X_i$. Then $P(X_i)+P(X_j)$ represents the probability
that X equals either $X_i$ or $X_j$; and $P_s(X)$ could be used to represent the
probability that X is equal to one of the values in the subset.

Before adding (or subtracting) probabilities one should determine
whether the events are mutually exclusive and whether all possible events
have been accounted for. Consider two subsets of elements, subset A and
subset B, of a population of N elements. Suppose one element is selected
at random. What is the probability that the selected element is a member
of either subset A or subset B? Let P(A) be the probability that the
selected element is from subset A; that is, P(A) is the sum of the
selection probabilities for elements in subset A. P(B) is defined similarly.
If the two subsets are mutually exclusive, which means that no element is
in both subsets, the probability that the element selected is from either
subset A or subset B is P(A) + P(B). If some elements are in both subsets,
see Figure 2.1, then event A (which is the selected element being a member
of subset A) and event B (which is the selected element being a member of
subset B) are not mutually exclusive events. Elements included in both
subsets are counted once in P(A) and once in P(B). Therefore, we must
subtract P(A,B) from P(A) + P(B) where P(A,B) is the sum of the probabilities
for the elements that belong to both subset A and subset B. Thus,

P(A or B) = P(A) + P(B) - P(A,B)
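
The additive law can be illustrated by summing selection probabilities over subsets. In the sketch below the five-element population, its unequal selection probabilities, and the membership of subsets A and B are all made up for illustration.

```python
from fractions import Fraction

# Hypothetical selection probabilities for a population of five elements.
P = {1: Fraction(1, 10), 2: Fraction(1, 5), 3: Fraction(3, 10),
     4: Fraction(1, 4), 5: Fraction(3, 20)}

# Hypothetical subsets; element 3 belongs to both.
A = {1, 2, 3}
B = {3, 4}

def prob(subset):
    """Sum of the selection probabilities of the elements in a subset."""
    return sum(P[e] for e in subset)

p_a, p_b = prob(A), prob(B)
p_ab = prob(A & B)          # elements in both A and B
p_a_or_b = prob(A | B)      # elements in A or B, each counted once

print(p_a, p_b, p_ab, p_a_or_b)
```

Adding P(A) and P(B) counts element 3 twice, which is why P(A,B) is subtracted.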




Figure 2.1 

To summarize, the additive law of probability as used above could be
stated as follows: If A and B are subsets of a set of all possible outcomes
that could occur as a result of a random trial or selection, the probability
that the outcome is in subset A or in subset B is equal to the probability
that the outcome is in A plus the probability that it is in B minus the
probability that it is in both A and B.

The additive law of probability extends without difficulty to three
or more subsets. Draw a figure like Figure 2.1 with three subsets so that
some points are common to all three subsets. Observe that the additive
law extends to three subsets as follows:

P(A or B or C) = P(A)+P(B)+P(C)-P(A,B)-P(A,C)-P(B,C)+P(A,B,C)

As a case for further discussion purposes, assume a population of N
elements and two criteria for classification. A two-way classification of
the elements could be displayed in the format of Table 2.1.

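Table 2.1. A two-way classification of N elements

            :                    X class                           :
   Y class  :   1          ...   j          ...   s                :  Total
   ---------:------------------------------------------------------:-------------
      1     :   N_11,P_11  ...   N_1j,P_1j  ...   N_1s,P_1s        :  N_1.,P_1.
      .     :   .                .                .                :  .
      i     :   N_i1,P_i1  ...   N_ij,P_ij  ...   N_is,P_is        :  N_i.,P_i.
      .     :   .                .                .                :  .
      t     :   N_t1,P_t1  ...   N_tj,P_tj  ...   N_ts,P_ts        :  N_t.,P_t.
   ---------:------------------------------------------------------:-------------
    Total   :   N_.1,P_.1  ...   N_.j,P_.j  ...   N_.s,P_.s        :  N, P = 1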



The columns represent a classification of the elements in terms of criterion
X; the rows represent a classification in terms of criterion Y; $N_{ij}$ is the
number of elements in X class j and Y class i; and $P_{ij}$ is the sum of the
selection probabilities for the elements in X class j and Y class i. Any
one of the N elements can be classified in one and only one of the t times
s cells.

Suppose one element from the population of N is selected. According
to the additive law of probability we can state that

$\sum_{i} P_{ij} = P_{\cdot j}$ is the probability that the element selected is from
X class j, and

$\sum_{j} P_{ij} = P_{i\cdot}$ is the probability that the element selected is from
Y class i, where

$P_{ij}$ is the probability that the element selected is from
(belongs to both) X class j and Y class i.

The probabilities $P_{\cdot j}$ and $P_{i\cdot}$ are called marginal probabilities.

The probability that one randomly selected element is from X class
j or from Y class i is $P_{\cdot j} + P_{i\cdot} - P_{ij}$. (The answer is not $P_{\cdot j} + P_{i\cdot}$ because
in $P_{\cdot j} + P_{i\cdot}$ there are $N_{ij}$ elements in X class j and Y class i that are
counted twice.)

If the probabilities of selection are equal, $P_{ij} = \frac{N_{ij}}{N}$, $P_{\cdot j} = \frac{N_{\cdot j}}{N}$,
and $P_{i\cdot} = \frac{N_{i\cdot}}{N}$.
i» N 

Illustration 2.2. Suppose there are 5,000 students in a university.
Assume there are 1,600 freshmen, 1,400 sophomores, and 500 students living
in dormitory A. From a list of the 5,000 students, one student is selected
at random. Assuming each student had an equal chance of selection, the
probability that the selected student is a freshman is $\frac{1600}{5000}$, that he is a
sophomore is $\frac{1400}{5000}$, and that he is either a freshman or a sophomore is
$\frac{1600}{5000} + \frac{1400}{5000}$. Also, the probability that the selected student lives in dormitory A
is $\frac{500}{5000}$. But, what is the probability that the selected student is either
a freshman or lives in dormitory A? The question involves two
classifications: one pertaining to the student's class and the other to where the
student lives. The information given about the 5000 students could be
arranged as follows:

               :               Class                  :
   Dormitory   :  Freshmen   Sophomores   Others      :  Total
   ------------:--------------------------------------:--------
       A       :                                      :   500
     Other     :                                      :  4500
   ------------:--------------------------------------:--------
     Total     :    1600        1400       2000       :  5000

From the above format, one can readily observe that the answer to the
question depends upon how many freshmen live in dormitory A. If the problem
had stated that 200 freshmen live in dormitory A, the answer would have
been $\frac{1600}{5000} + \frac{500}{5000} - \frac{200}{5000}$.

Statements about probability need to be made and interpreted with
great care. For example, it is not correct to say that a student has a
probability of 0.1 of living in dormitory A simply because 500 students out
of 5000 live in A. Unless students are assigned to dormitories by a random
process with known probabilities there is no basis for stating a student's
probability of living in (being assigned to) dormitory A. We are
considering the outcome of a random selection.
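
The arithmetic for the freshman-or-dormitory-A question can be written out as follows. The figure of 200 freshmen living in dormitory A is the supposition made in the illustration, not a given of the problem.

```python
from fractions import Fraction

N = 5000                     # students on the list
freshmen = 1600
dorm_a = 500
freshmen_in_dorm_a = 200     # supposed figure from the illustration

# Additive law with equal selection probabilities 1/N.
p_freshman_or_a = (Fraction(freshmen, N) + Fraction(dorm_a, N)
                   - Fraction(freshmen_in_dorm_a, N))

# Same answer by counting students who satisfy either condition once.
count_either = freshmen + dorm_a - freshmen_in_dorm_a

print(p_freshman_or_a, count_either)
```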

Exercise 2.1. Suppose one has the following information about a
population of 1000 farms:

600 produce corn
500 produce soybeans
300 produce wheat
100 produce wheat and corn
200 have one or more cows
all farms that have cows also produce corn
200 farms do not produce any crops

One farm is selected at random with equal probability from the list
of 1000. What is the probability that the selected farm,

(a) produces corn? Answer: 0.6

(b) does not produce wheat?

(c) produces corn but no wheat? Answer: 0.5

(d) produces corn or wheat but not both?

(e) has no cows? Answer: 0.8

(f) produces corn or soybeans?

(g) produces corn and has no cows? Answer: 0.4

(h) produces either corn, cows, or both?

(i) does not produce corn or wheat?

One of the above questions cannot be answered.
Exercise 2.2. Assume a population of 10 elements and selection
probabilities as follows:

   Element   X_i    P_i   :   Element   X_i    P_i
   -----------------------:-----------------------
      1       2     .05   :      6      11     .15
      2       7     .10   :      7       2     .20
      3      12     .08   :      8       8     .05
      4       0     .02   :      9       6     .05
      5       8     .20   :     10       3     .10

One element is selected at random with probability $P_i$.
Find:

(a) $P(X=2)$, the probability that X = 2.

(b) $P(X>10)$, the probability that X is greater than 10.

(c) $P(X \le 2)$, the probability that X is equal to or less than 2.

(d) $P(3<X<10)$, the probability that X is greater than 3 and less
than 10.

(e) $P(X \le 3 \text{ or } X \ge 10)$, the probability that X is either equal to or less
than 3 or is equal to or greater than 10.

Note: The answer to (d) and the answer to (e) should add to 1.

So far, we have been discussing the probability of an event occurring as
a result of a single random selection. When more than one random selection
occurs simultaneously or in succession the multiplicative law of
probability is useful.

2.3 MULTIPLICATION OF PROBABILITIES 

Assume a population of N elements and selection probabilities
$P_1,\dots,P_i,\dots,P_N$. Each $P_i$ is greater than zero and $\sum_{i}^{N} P_i = 1$. Suppose
two elements are selected but before the second selection is made the
first element selected is returned to the population. In this case the
outcome of the first selection does not change the selection probabilities
for the second selection. The two selections (events) are independent.
The probability of selecting the i-th element first and the j-th element
second is $P_iP_j$, the product of the selection probabilities $P_i$ and $P_j$.
If a selected element is not returned to the population before the next
selection is made, the selection probabilities for the next selection are
changed. The selections are dependent.



The multiplicative law of probability, for two independent events
A and B, states that the joint probability of A and B happening in the
order A,B is equal to the probability that A happens times the
probability that B happens. In equation form, P(AB) = P(A)P(B). For the
order B,A, P(BA) = P(B)P(A) and we note that P(AB) = P(BA). Remember,
independence means that the probability of B happening is not affected
by the occurrence of A and vice versa. The multiplicative law extends
to any number of independent events. Thus, P(ABC) = P(A)P(B)P(C).

For two dependent events A and B, the multiplicative law states that
the joint probability of A and B happening in the order A,B is equal to
the probability of A happening times the probability that B happens under
the condition that A has already happened. In equation form P(AB) =
P(A)P(B|A); or for the order B,A we have P(BA) = P(B)P(A|B). The vertical
bar can usually be translated as "given" or "given that." The notation on
the left of the bar refers to the event under consideration and the
notation on the right to a condition under which the event can take place.
P(B|A) is called conditional probability and could be read "the
probability of B, given that A has already happened," or simply "the
probability of B given A." When the events are independent, P(B|A) = P(B);
that is, the conditional probability of B occurring is the same as the
unconditional probability of B. Extending the multiplication rule to a
series of three events A,B,C occurring in that order, we have P(ABC) =
P(A)P(B|A)P(C|AB) where P(C|AB) is the probability of C occurring, given
that A and B have already occurred.
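
The difference between independent and dependent selections can be seen in a small enumeration. This sketch assumes a population of four elements with equal selection probabilities and, for the dependent case, assumes that after the first draw each remaining element is equally likely to be drawn next. Both assumptions are made here for illustration only.

```python
from fractions import Fraction
from itertools import permutations, product

N = 4
elements = range(1, N + 1)
P = {e: Fraction(1, N) for e in elements}   # assumed equal probabilities

# Independent selections (first element returned before the second draw):
# P(AB) = P(A) P(B), for A = "element 1 first", B = "element 2 second".
with_replacement = P[1] * P[2]

# Dependent selections (first element not returned), assuming the N - 1
# remaining elements are equally likely: P(AB) = P(A) P(B|A).
p_b_given_a = Fraction(1, N - 1)
without_replacement = P[1] * p_b_given_a

# Cross-check by listing the equally likely ordered outcomes in each case.
ordered_with = list(product(elements, repeat=2))
ordered_without = list(permutations(elements, 2))

print(with_replacement, without_replacement)
print(len(ordered_with), len(ordered_without))
```
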
2.4 SAMPLING WITH REPLACEMENT 

When a sample is drawn and each selected element is returned to the
population before the next selection is made, the method of sampling is
called "sampling with replacement." In this case, the outcome of one
selection does not change the selection probabilities for another
selection.

Suppose a sample of n elements is selected with replacement. Let the
values of X in the sample be $x_1, x_2, \ldots, x_n$ where $x_1$ is the value of X
obtained on the first selection, $x_2$ the value obtained on the second
selection, etc. Notice that $x_1$ is a random variable that could be equal
to any value in the population set of values $X_1, X_2, \ldots, X_N$, and the
probability that $x_1$ equals $X_i$ is $P_i$. The same statement applies to $x_2$, etc.
Since the selections are independent, the probability of getting a sample
of n in a particular order is the product of the selection probabilities,
namely, $p(x_1)p(x_2) \cdots p(x_n)$ where $p(x_1)$ is the $P_i$ for the element selected
on the first draw, $p(x_2)$ is the $P_i$ for the element selected on the second
draw, etc.

Illustration 2.3. As an illustration, consider a sample of two
elements selected with equal probability and with replacement from a
population of four elements. Suppose the values of some characteristic X for
the four elements are $X_1$, $X_2$, $X_3$, and $X_4$. There are 16 possibilities:

    $X_1,X_1$    $X_2,X_1$    $X_3,X_1$    $X_4,X_1$
    $X_1,X_2$    $X_2,X_2$    $X_3,X_2$    $X_4,X_2$
    $X_1,X_3$    $X_2,X_3$    $X_3,X_3$    $X_4,X_3$
    $X_1,X_4$    $X_2,X_4$    $X_3,X_4$    $X_4,X_4$

In this illustration $p(x_1)$ is always equal to $\frac{1}{4}$ and $p(x_2)$ is always $\frac{1}{4}$.
Hence each of the 16 possibilities has a probability of
$\left(\frac{1}{4}\right)\left(\frac{1}{4}\right) = \frac{1}{16}$.






Each of the 16 possibilities is a different permutation that could
be regarded as a separate sample. However, in practice (as we are not
concerned about which element was selected first or second) it is more
logical to disregard the order of selection. Hence, as possible samples
and the probability of each occurring, we have:



    Sample       Probability      Sample       Probability

    $X_1,X_1$       1/16          $X_2,X_3$       1/8
    $X_1,X_2$       1/8           $X_2,X_4$       1/8
    $X_1,X_3$       1/8           $X_3,X_3$       1/16
    $X_1,X_4$       1/8           $X_3,X_4$       1/8
    $X_2,X_2$       1/16          $X_4,X_4$       1/16



Note that the sum of the probabilities is 1. That must always be the
case if all possible samples have been listed with the correct
probabilities. Also note that, since the probability (relative frequency
of occurrence) of each sample is known, the average for each sample is
a random variable. In other words, there were 10 possible samples, and
any one of 10 possible sample averages could have occurred with the
probability indicated. This is a simple illustration of the fact that
the sample average satisfies the definition of a random variable. As
the theory of sampling unfolds, we will be examining the properties of
a sample average that exist as a result of its being a random variable.
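
The collapse from 16 ordered possibilities to 10 unordered samples can be
checked with a short enumeration. The following is only a sketch: the
function name and the numeric values assigned to X are hypothetical (the
values are chosen so that all 10 sample averages differ), and exact
fractions are used so the sum can be compared with 1.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def unordered_sample_probs(P):
    """Probabilities of unordered samples of two drawn with replacement.

    P maps an element label to its selection probability.
    """
    probs = defaultdict(Fraction)
    for first, second in product(P, repeat=2):    # every ordered possibility
        sample = tuple(sorted((first, second)))   # disregard order of selection
        probs[sample] += P[first] * P[second]     # independent draws multiply
    return dict(probs)


# Equal selection probabilities, as in Illustration 2.3.
P = {i: Fraction(1, 4) for i in (1, 2, 3, 4)}
# Hypothetical values of X (not from the text).
X = {1: 1, 2: 2, 3: 4, 4: 8}

sample_probs = unordered_sample_probs(P)

# Distribution of the sample average over the possible samples.
average_dist = defaultdict(Fraction)
for (a, b), p in sample_probs.items():
    average_dist[Fraction(X[a] + X[b], 2)] += p

for sample, p in sorted(sample_probs.items()):
    print(sample, p)
print("sum of probabilities:", sum(sample_probs.values()))
```

Passing unequal selection probabilities to the same function gives the
probabilities of the ten samples when the $P_i$ differ.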

Exercise 2.3. With reference to Illustration 2.3, suppose the
probabilities of selection were $P_1 = \frac{1}{4}$, $P_2 = \frac{1}{8}$, $P_3 = \frac{3}{8}$, and $P_4 = \frac{1}{4}$.
Find the probability of each of the ten samples. Remember the sampling
is with replacement. Check your results by adding the 10 probabilities.
The sum should be 1. Partial answer: For the sample composed of elements
2 and 4 the probability is
$\left(\frac{1}{8}\right)\left(\frac{1}{4}\right) + \left(\frac{1}{4}\right)\left(\frac{1}{8}\right) = \frac{1}{16}$.

2.5 SAMPLING WITHOUT REPLACEMENT 

When a selected element is not returned to the population before the
next selection is made, the sampling method is called sampling without
replacement. In this case, the selection probabilities change from one
draw to the next; that is, the selections (events) are dependent.

As above, assume a population of N elements with values of some
characteristic X equal to $X_1, X_2, \ldots, X_N$. Let the selection probabilities
for the first selection be $P_1, \ldots, P_i, \ldots, P_N$ where each $P_i > 0$ and $\sum P_i = 1$.
Suppose three elements are selected without replacement. Let $x_1$, $x_2$, and
$x_3$ be the values of X obtained on the first, second, and third random
draws, respectively. What is the probability that $x_1 = X_5$, $x_2 = X_6$, and
$x_3 = X_7$? Let $P(X_5,X_6,X_7)$ represent this probability, which is the
probability of selecting elements 5, 6, and 7 in that order.

According to the multiplicative probability law for dependent events,

$$P(X_5,X_6,X_7) = P(X_5)\,P(X_6|X_5)\,P(X_7|X_5,X_6)$$

It is clear that $P(X_5) = P_5$. For the second draw the selection
probabilities (after element 5 is eliminated) must be adjusted so they add
to 1. Hence, for the second draw the selection probabilities are

$$\frac{P_1}{1-P_5},\ \frac{P_2}{1-P_5},\ \frac{P_3}{1-P_5},\ \frac{P_4}{1-P_5},\ \frac{P_6}{1-P_5},\ \ldots,\ \frac{P_N}{1-P_5}$$

That is, $P(X_6|X_5) = \frac{P_6}{1-P_5}$.

Similarly, $P(X_7|X_5,X_6) = \frac{P_7}{1-P_5-P_6}$.

Therefore,

$$P(X_5,X_6,X_7) = (P_5)\left(\frac{P_6}{1-P_5}\right)\left(\frac{P_7}{1-P_5-P_6}\right) \qquad (2.1)$$




Observe that $P(X_6,X_5,X_7) = (P_6)\left(\frac{P_5}{1-P_6}\right)\left(\frac{P_7}{1-P_6-P_5}\right)$. Hence,
$P(X_5,X_6,X_7) \neq P(X_6,X_5,X_7)$ unless $P_5 = P_6$. In general, each permutation
of n elements has a different probability of occurrence unless the $P_i$'s
are all equal. To obtain the exact probability of selecting a sample
composed of elements 5, 6, and 7, one would need to compute the
probability for each of the six possible permutations and get the sum of
the six probabilities.
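
The computation described here can be sketched in a few lines. The
function names and the eight selection probabilities below are
hypothetical; the only thing taken from the text is the rule that each
conditional probability is the element's $P_i$ divided by one minus the
$P_i$'s already drawn.

```python
from fractions import Fraction
from itertools import permutations


def ordered_prob(P, order):
    """Probability of drawing the elements of `order` in that order,
    without replacement."""
    prob = Fraction(1)
    remaining = Fraction(1)          # 1 minus the P_i of elements already drawn
    for element in order:
        prob *= P[element] / remaining
        remaining -= P[element]
    return prob


def sample_prob(P, elements):
    """Probability that the sample consists of `elements`, in any order."""
    return sum(ordered_prob(P, perm) for perm in permutations(elements))


# Hypothetical unequal selection probabilities for N = 8 elements: P_i = i/36.
P = {i: Fraction(i, 36) for i in range(1, 9)}

p_567 = ordered_prob(P, (5, 6, 7))
p_657 = ordered_prob(P, (6, 5, 7))
p_sample = sample_prob(P, (5, 6, 7))
print(p_567, p_657, p_sample)
```

With these hypothetical values the two orderings give different
probabilities, and the six permutations must all be summed to get the
probability of the sample composed of elements 5, 6, and 7.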

Incidentally, in the actual process of selection, it is not necessary
to compute a new set of selection probabilities after each selection
is made. Make each selection in the same way that the first selection
was made. If an element is selected which has already been drawn, ignore
the random number and continue the same process of random selection
until a new element is drawn.
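
A minimal simulation of this selection procedure is sketched below. The
function name, the three labels, their probabilities, and the seed are
assumptions made for the example. The observed relative frequency of
one particular ordered sample is compared with the value given by the
multiplicative law for dependent events.

```python
import random


def draw_without_replacement(labels, probs, n, rng):
    """Select n distinct elements by repeating the first-draw procedure and
    ignoring any draw that hits an element already selected."""
    if n > sum(1 for p in probs if p > 0):
        raise ValueError("n exceeds the number of selectable elements")
    chosen = []
    while len(chosen) < n:
        pick = rng.choices(labels, weights=probs, k=1)[0]
        if pick in chosen:
            continue                  # already drawn: ignore and draw again
        chosen.append(pick)
    return chosen


rng = random.Random(12345)            # fixed seed so the run is repeatable
labels = [1, 2, 3]
probs = [0.5, 0.3, 0.2]               # hypothetical selection probabilities
trials = 20000

hits = sum(
    draw_without_replacement(labels, probs, 2, rng) == [1, 2]
    for _ in range(trials)
)
observed = hits / trials
expected = 0.5 * (0.3 / (1 - 0.5))    # P_1 times P_2 / (1 - P_1)
print(observed, expected)
```

The observed frequency should be close to the expected value but will
not match it exactly, since it comes from a finite number of trials.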

As indicated by the very brief discussion in this section, the
theory of sampling without replacement and with unequal probability of
selection can be very complex. However, books on sampling present ways
of circumventing the complex problems. In fact, it is practical and
advantageous in many cases to use unequal probability of selection in
sampling. The probability theory for sampling with equal probability
of selection and without replacement is relatively simple and will be
discussed in more detail.

Exercise 2.4. For a population of 4 elements there are six possible
samples of two when sampling without replacement. Let $P_1 = \frac{1}{4}$, $P_2 = \frac{1}{8}$,
$P_3 = \frac{3}{8}$, and $P_4 = \frac{1}{4}$. List the six possible samples and find the
probability of getting each sample. Should the probabilities for the six
samples add to 1? Check your results.






Exercise 2.5. Suppose two elements are selected with replacement
and with equal probability from a population of 100 elements. Find the
probability: (a) that element number 10 is not selected, (b) that
element number 10 is selected only once, and (c) that element number 10 is
selected twice. As a check, the three probabilities should add to 1.
Why? Find the probability of selecting the combination of elements 10
and 20.

Exercise 2.6. Refer to Exercise 2.5 and change the specification
"with replacement" to "without replacement." Answer the same questions.
Why is the probability of getting the combination of elements 10 and 20
greater than it was in Exercise 2.5?
2.6 SIMPLE RANDOM SAMPLES 

In practice, nearly all samples are selected without replacement.
Selection of a random sample of n elements, with equal probability and
without replacement, from a population of N elements is called simple
random sampling (srs). One element must be selected at a time, that is,
n separate random selections are required.

First, the probability of getting a particular combination of n
elements will be discussed. Refer to Equation (2.1) and the discussion
preceding it. The $P_i$'s are all equal to $\frac{1}{N}$ for simple random sampling.
Therefore, Equation (2.1) becomes
$P(X_5,X_6,X_7) = \left(\frac{1}{N}\right)\left(\frac{1}{N-1}\right)\left(\frac{1}{N-2}\right)$. All
permutations of the three elements 5, 6, and 7 have the same probability of
occurrence. There are 3! = 6 possible permutations. Therefore, the
probability that the sample is composed of the elements 5, 6, and 7 is
$\frac{(1)(2)(3)}{N(N-1)(N-2)}$. Any other combination of three elements has the same
probability of occurrence.




In general, all possible combinations of n elements have the same
chance of selection and any particular combination of n has the following
probability of being selected:

$$\frac{(1)(2)(3)\cdots(n)}{N(N-1)(N-2)\cdots(N-n+1)} = \frac{n!\,(N-n)!}{N!} \qquad (2.2)$$

According to a theorem on number of combinations, there are
$\frac{N!}{n!\,(N-n)!}$ possible combinations (samples) of n elements. If each
combination of n elements has the same chance of being the sample selected,
the probability of selecting a specified combination must be the reciprocal
of the number of combinations. This checks with Equation (2.2).
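
As a numerical check of Equation (2.2), the product form on the left can
be compared with the reciprocal of the number of combinations. This is
only a sketch; the function name and the values of N and n are chosen
for illustration.

```python
from fractions import Fraction
from math import comb, factorial


def srs_combination_prob(N, n):
    """Left side of Equation (2.2): (1)(2)...(n) / [N(N-1)...(N-n+1)]."""
    prob = Fraction(1)
    for t in range(n):
        prob *= Fraction(t + 1, N - t)
    return prob


N, n = 100, 3                          # illustrative values
product_form = srs_combination_prob(N, n)
factorial_form = Fraction(factorial(n) * factorial(N - n), factorial(N))
reciprocal_of_count = Fraction(1, comb(N, n))
print(product_form, factorial_form, reciprocal_of_count)
```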

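An important feature of srs that will be needed in the chapter on
expected values is the fact that the $j$th element of the population is as
likely to be selected at the $i$th random draw as any other. A general
expression for the probability that the $j$th element of the population is
selected at the $i$th drawing is

$$\left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right)\left(\frac{N-3}{N-2}\right)\cdots\left(\frac{N-i+1}{N-i+2}\right)\left(\frac{1}{N-i+1}\right) = \frac{1}{N} \qquad (2.3)$$

Let us check Equation 2.3 for i = 3. The equation becomes

$$\left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right)\left(\frac{1}{N-2}\right) = \frac{1}{N}$$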

The probability that the $j$th element of the population is selected at the
third draw is equal to the probability that it was not selected at either
the first or second draw times the conditional probability of being
selected at the third draw, given that it was not selected at the first
or second draw. (Remember, the sampling is without replacement.) Notice
that $\frac{N-1}{N}$ is the probability that the $j$th element is not selected at the
first draw and $\frac{N-2}{N-1}$ is the conditional probability that it was not selected
at the second draw. Therefore, $\left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right)$ is the probability that the $j$th
element has not been selected prior to the third draw. When the third
draw is made, the conditional probability of selecting the $j$th element
is $\frac{1}{N-2}$. Hence the probability of selecting the $j$th element at the third
draw is $\left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right)\left(\frac{1}{N-2}\right) = \frac{1}{N}$. This verifies Equation (2.3) for i = 3.

To summarize, the general result for any size of sample is that the
$j$th element in a population has a probability equal to $\frac{1}{N}$ of being selected
at the $i$th drawing. It means that $x_i$ (the value of X obtained at the $i$th
draw) is a random variable that has a probability of $\frac{1}{N}$ of being equal to
any value of the set $X_1, \ldots, X_N$.



What probability does the $j$th element have of being included in a
sample of n? We have just shown that it has a probability of $\frac{1}{N}$ of being
selected at the $i$th drawing. Therefore, any given element of the
population has n chances, each equal to $\frac{1}{N}$, of being included in a sample. The
element can be selected at the first draw, or the second draw, ..., or the
$n$th draw and it cannot be selected twice because the sampling is without
replacement. Therefore the probabilities, $\frac{1}{N}$ for each of the n draws, can
be added which gives $\frac{n}{N}$ as the probability of any given element being
included in the sample.
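
Both results, $\frac{1}{N}$ at each draw and $\frac{n}{N}$ for inclusion, can be confirmed by
listing every equally likely ordered sample for a small population. The
sketch below assumes N = 6 and n = 3 purely for illustration; the
function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def draw_position_probs(N, n, j):
    """For srs of n from elements 1..N, return the probability that element j
    is selected at each draw, and the probability that j is in the sample."""
    orders = list(permutations(range(1, N + 1), n))   # equally likely ordered samples
    p_each = Fraction(1, len(orders))
    at_draw = [
        sum(p_each for order in orders if order[i] == j) for i in range(n)
    ]
    included = sum(p_each for order in orders if j in order)
    return at_draw, included


N, n = 6, 3                              # illustrative sizes
at_draw, included = draw_position_probs(N, n, j=4)
print(at_draw, included)
```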

Illustration 2.4. Suppose one has a list of 1,000 farms which includes
some farms that are out-of-scope (not eligible) for a survey. There is no
way of knowing in advance whether a farm on the list is out-of-scope. A
simple random sample of 200 farms is selected from the list. All 200 farms
are visited but only the ones found to be in scope are included in the
sample. What probability does an in-scope farm have of being in the
sample? Every farm on the list of 1,000 farms has a probability equal to $\frac{1}{5}$
of being in the sample of 200. All in-scope farms in the sample of 200
are included in the final sample. Therefore, the answer is $\frac{1}{5}$.

Exercise 2.7. From the following set of 12 values of X a srs of
three elements is to be selected: 2, 10, 5, 8, 1, 15, 7, 8, 13, 4, 6,
and 2. Find $P(\bar{x} > 12)$ and $P(3 < \bar{x} < 12)$. Remember that the total possible
number of samples of 3 can readily be obtained by formula. Since every
possible sample of three is equally likely, you can determine which
samples will have an $\bar{x} < 3$ or an $\bar{x} > 12$ without listing all of the numerous
possible samples. Answer: P(x>12) •» -j— ; P(x<3) » ~ ; P(3<x<12) " 
2.7 SOME EXAMPLES OF RESTRICTED RANDOM SAMPLING

There are many methods other than srs that will give every element
an equal chance of being in the sample, but some combinations of n
elements do not have a chance of being the sample selected unless srs is
used. For example, one might take every $k$th element beginning from a
random starting point between 1 and k. This is called systematic
sampling. For a five percent sample k would be 20. The first element for
the sample would be a random number between 1 and 20. If it is 12, then
elements 12, 32, 52, etc., compose the sample. Every element has an
equal chance, $\frac{1}{20}$, of being in the sample, but there are only 20
combinations of elements that have a chance of being the sample selected.
Simple random sampling could have given the same sample but it is the
method of sampling that characterizes a sample and determines how error
due to sampling is to be estimated. One may think of sample design as a
matter of choosing a method of sampling; that is, choosing restrictions
to place on the process of selecting a sample so the combinations which
have a chance of being the sample selected are generally "better" than
many of the combinations that could occur with simple random sampling.
At the same time, important properties that exist for simple random
samples need to be retained. The key properties of srs will be developed in
the next two chapters.
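
The systematic sampling example can be sketched as follows. The
population size of 1,000 is an assumption made for the example (the text
gives only k = 20), and the function name is hypothetical.

```python
from math import comb


def systematic_samples(N, k):
    """All k possible systematic samples from elements 1..N,
    one for each random start between 1 and k."""
    return [list(range(start, N + 1, k)) for start in range(1, k + 1)]


N, k = 1000, 20                  # N is hypothetical; k = 20 gives a 5 percent sample
samples = systematic_samples(N, k)
n = len(samples[0])              # elements per sample

print("possible systematic samples:", len(samples))
print("start 12 gives:", samples[11][:3], "...")
print("possible srs samples of the same size:", comb(N, n))
```

Each element belongs to exactly one of the 20 possible samples, which is
why its chance of being in the sample is $\frac{1}{20}$, while the number of
combinations possible under srs is vastly larger.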

Another common method of sampling involves classification of all
elements of a population into groups called strata. A sample is selected
from each stratum. Suppose $N_i$ elements of the population are in the $i$th
stratum and a simple random sample of $n_i$ elements is selected from it.
This is called stratified random sampling. It is clear that every
element in the $i$th stratum has a probability equal to $\frac{n_i}{N_i}$ of being in the
sample. If the sampling fraction, $\frac{n_i}{N_i}$, is the same for all strata,
every element of the population has an equal chance, namely $\frac{n_i}{N_i}$, of
being in the sample. Again every element of the population has an equal
chance of selection and of being in the sample selected, but some
combinations that could occur when the method is srs cannot occur when
stratified random sampling is used.
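
A small numerical sketch of this point follows. The three stratum sizes
and sample sizes are hypothetical, chosen so the sampling fraction is
$\frac{1}{10}$ in every stratum.

```python
from fractions import Fraction
from math import comb, prod

strata_sizes = [40, 60, 100]     # hypothetical N_i
sample_sizes = [4, 6, 10]        # hypothetical n_i, same fraction in each stratum

# Probability that an element of stratum i is in the sample: n_i / N_i.
inclusion = [Fraction(n_i, N_i) for n_i, N_i in zip(sample_sizes, strata_sizes)]

# Number of different samples that can occur under each method.
stratified_count = prod(
    comb(N_i, n_i) for N_i, n_i in zip(strata_sizes, sample_sizes)
)
srs_count = comb(sum(strata_sizes), sum(sample_sizes))

print(inclusion)
print(stratified_count < srs_count)
```

Every element has the same chance of inclusion under both methods, but
stratification rules out every combination that does not take exactly
$n_i$ elements from each stratum.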

So far, our discussion has referred to the selection of individual
elements, which are the units that data pertain to. For sampling purposes
a population must be divided into parts which are called sampling units.
A sample of sampling units is then selected. Sampling units and elements
could be identical. But very often, it is either not possible or not
practical to use individual elements as sampling units. For example,
suppose a sample of households is needed. A list of households does not
exist but a list of blocks covering the area to be surveyed might be
available. In this case, a sample of blocks might be selected and all
households within the selected blocks included in the sample. The blocks
are the sampling units and the elements are households. Every element of the
population should belong to one and only one sampling unit so the list of
sampling units will account for all elements of the population without
duplication or omission. Then, the probability of selecting any given
element is the same as the probability of selecting the sampling unit
that it belongs to.

Illustration 2.5. Suppose a population is composed of 1800 dwelling
units located within 150 well-defined blocks. There are several possible
sampling plans. A srs of 25 blocks could be selected and every dwelling
unit in the selected blocks could be included in the sample. In this
case, the sampling fraction is $\frac{1}{6}$ and every dwelling unit has a probability
of $\frac{1}{6}$ of being in the sample. Is this a srs of dwelling units? No, but
one could describe the sample as a random sample (or a probability sample)
of dwelling units and state that every dwelling unit had an equal chance
of being in the sample. That is, the term "simple random sample" would
apply to blocks, not dwelling units. As an alternative sampling plan, if
there were twelve dwelling units in each of the 150 blocks, a srs of two
dwelling units could be selected from each block. This scheme, which is an
example of stratified random sampling, would also give every dwelling unit
a probability equal to $\frac{1}{6}$ of being in the sample.

Illustration 2.6. Suppose that a sample is desired of 100 adults
living in a specified area. A list of adults does not exist, but a list
of 4,000 dwelling units in the area is available. The proposed sampling
plan is to select a srs of 100 dwelling units from the list. Then, the
field staff is to visit the sample dwellings and list all adults living
in each. Suppose there are 220 adults living in the 100 dwelling units.
A simple random sample of 100 adults is selected from the list of 220.
Consider the probability that an adult in the population has of being in
the sample of 100 adults.

Parenthetically, we should recognize that the discussion which
follows overlooks important practical problems of definition such as the
definition of a dwelling unit, the definition of an adult, and the
definition of living in a dwelling unit. However, assume the definitions are
clear, that the list of dwelling units is complete, that no dwelling is
on the list more than once, and that no ambiguity exists about whether
an adult lives or does not live in a particular dwelling unit.
Incomplete definitions often lead to inexact probabilities or ambiguity that
gives difficulty in analyzing or interpreting results. The many practical
problems should be discussed in an applied course on sampling.

It is clear that the probability of a dwelling unit being in the
sample is $\frac{1}{40}$. Therefore, every person on the list of 220 had a chance
of $\frac{1}{40}$ of being on the list because, under the specifications, a person
lives in one and only one dwelling unit, and an adult's chance of being
on the list is the same as that of the dwelling unit he lives in.

The second phase of sampling involves selecting a simple random
sample of 100 adults from the list of 220. The conditional probability
of an adult being in the sample of 100 is $\frac{100}{220}$. That is, given the
fact that an adult is on the list of 220, he now has a chance of $\frac{100}{220}$ of
being in the sample of 100.

Keep in mind that the probability of an event happening is its
relative frequency in repeated trials. If another sample were selected
following the above specifications, each dwelling unit in the population
would again have a chance of $\frac{1}{40}$ of being in the sample; but the number of
adults listed is not likely to be 220, so the conditional probability at
the second phase depends upon the number of adults listed in the sample
dwelling units. Does every adult have the same chance of being in the sample?
Examine the case carefully. An initial impression could be misleading.
Every adult in the population has an equal chance of being listed in the
first phase and every adult listed has an equal chance of being selected
at the second phase. But, in terms of repetition of the whole sampling
plan each person does not have exactly the same chance of being in the
sample of 100. The following exercise will help clarify the situation
and is a good exercise in probability.

Exercise 2.8. Assume a population of 5 d.u.'s (dwelling units) with
the following numbers of adults:

Dwelling Unit    Adults

1 2 

2 h 



A srs of two d.u.'s is selected. A srs of 2 adults is then selected from
a list of all adults in the two d.u.'s. Find the probability that a
specified adult in d.u. No. 1 has of being in the sample. Answer: 0.19. Find
the probability that an adult in d.u. No. 2 has of being in the sample.
Does the probability of an adult being in the sample appear to be related
to the number of adults in his d.u.? In what way?




An alternative is to take a constant fraction of the adults listed
instead of a constant number. For example, the specification might have
been to select a random sample of $\frac{1}{2}$ of the adults listed in the first
phase. In this case, under repeated application of the sampling
specifications, the probability at the second phase does not depend on the
outcome of the first phase and each adult in the population has an equal
chance, $\left(\frac{1}{40}\right)\left(\frac{1}{2}\right) = \frac{1}{80}$, of being selected in the sample. Notice that
under this plan the number of adults in a sample will vary from sample
to sample; in fact, the number of adults in the sample is a random variable.

For some surveys, interviewing more than one adult in a dwelling unit
is inadvisable. Again, suppose the first phase of sampling is to select
a srs of 100 dwelling units. For the second phase, consider the following:
When an interviewer completes the listing of adults in a sample dwelling,
he is to select one adult, from the list of those living in the dwelling,
at random in accordance with a specified set of instructions. He then
interviews the selected adult if available; otherwise, he returns at a
time when the selected adult is available. What probability does an adult
living in the area have of being in the sample? According to the
multiplication theorem, the answer is $P'(D)P(A|D)$ where $P'(D)$ is the probability
of the dwelling unit, in which the adult lives, being in the sample and
$P(A|D)$ is the probability of the adult being selected given that his
dwelling is in the sample. More specifically, $P'(D) = \frac{1}{40}$ and $P(A|D) = \frac{1}{k_i}$,
where $k_i$ is the number of adults in the $i$th dwelling. Thus, an adult's
chance, $\left(\frac{1}{40}\right)\left(\frac{1}{k_i}\right)$, of being in a sample is inversely proportional to the
number of adults in his dwelling unit.
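
The product $P'(D)P(A|D)$ can be verified by enumeration on a small
population. The sketch below is built on assumptions: five hypothetical
dwellings with the numbers of adults shown, a srs of two dwellings, and
one adult picked at random in each selected dwelling. The function name
is also hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def adult_inclusion_prob(adults_per_dwelling, m, dwelling):
    """Probability that a specified adult living in `dwelling` is in the
    sample when m dwellings are selected by srs and one adult is then
    selected at random in each selected dwelling."""
    samples = list(combinations(adults_per_dwelling, m))   # equally likely
    p_sample = Fraction(1, len(samples))
    total = Fraction(0)
    for sample in samples:
        if dwelling in sample:
            # Conditional probability of the adult, given the dwelling.
            total += p_sample * Fraction(1, adults_per_dwelling[dwelling])
    return total


# Hypothetical population: dwelling label -> number of adults.
adults = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 4}
probs = {d: adult_inclusion_prob(adults, 2, d) for d in adults}
print(probs)

# The plan in the text: P'(D) = 100/4000, and a dwelling with 3 adults.
print(Fraction(100, 4000) * Fraction(1, 3))
```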

Exercise 2.9. Suppose there are five dwelling units and 12 persons
living in the five dwelling units as follows:

    Dwelling Unit    Individuals

          1          1, 2
          2          3, 4, 5, 6
          3          7, 8
          4          9
          5          10, 11, 12

1. A sample of two dwelling units is selected with equal probability
and without replacement. All individuals in the selected dwelling units
are in the sample. What probability does individual number 4 have of being
in the sample? Individual number 9?

2. Suppose from a list of the twelve individuals that one individual
is selected with equal probability. From the selected individual two
items of information are obtained: his age and the value of the dwelling
in which he lives. Let $X_1, \ldots, X_{12}$ represent the ages of the 12
individuals and let $Y_1, \ldots, Y_5$ represent the values of the five dwelling units.
Clearly, the probability of selecting the $i$th individual is $\frac{1}{12}$ and
therefore $P(X_i) = \frac{1}{12}$. Find the five probabilities $P(Y_1), \ldots, P(Y_5)$. Do you

agree that P(Y^^) - |^ ? As a check, j:p(Y^) should equal one. 

3. Suppose a sample of two individuals is selected with equal
probability and without replacement. Let $Y_{1j}$ be the value of Y obtained at
the first draw and $Y_{2j}$ be the value of Y obtained at the second draw.
Does $P(Y_{1j}) = P(Y_{2j})$? That is, is the probability of getting $Y_j$ on the
second draw the same as it was on the first? If the answer is not evident,
refer to Section 2.5.

Exercise 2.10. A small sample of third-grade students enrolled in
public schools in a State is desired. The following plan is presented only
as an exercise and without consideration of whether it is a good one: A
sample of 10 third-grade classes is to be selected. All students in the
10 classes will be included in the sample.

Step 1. Select a srs of 10 school districts.

Step 2. Within each of the 10 school districts, prepare a list
        of public schools having a third grade. Then select one
        school at random from the list.

Step 3. For each of the 10 schools resulting from Step 2, list
        the third-grade classes and select one class at random.
        (If there is only one third-grade class in the school,
        it is in the sample.) This will give a sample of 10 classes.

Describe third-grade classes in the population which have relatively
small chances of being selected. Define needed notation and write a
mathematical expression representing the probability of a third-grade
class being in the sample.
2.8 TWO-STAGE SAMPLING 

For various reasons sampling plans often employ two or more stages 
of sampling. For example, a sample of counties might be selected, then 
within each sample county a sample of farms might be selected. 

Units used at the first stage of sampling are usually called primary 
sampling units or psu's. The sampling units at the second stage of sam- 
pling could be called secondary sampling units. However, since there has 
been frequent reference earlier in this chapter to "elements of a
population," the sampling units at the second stage will be called elements.

In the simple case of two-stage sampling, each element of the
population is associated with one and only one primary sampling unit. Let i
be the index for psu's and let j be the index for elements within a psu.
Thus $X_{ij}$ represents the value of some characteristic X for the $j$th element
in the $i$th psu. Also, let

    $M$ = the total number of psu's,
    $m$ = the number of psu's selected for a sample,
    $N_i$ = the total number of elements in the $i$th psu, and
    $n_i$ = the number of elements in the sample from the $i$th psu.

Then,

    $\sum_{i}^{M} N_i = N$, the total number of elements in the population, and

    $\sum_{i}^{m} n_i = n$, the total number of elements in the sample.
1 

Now consider the probability of an element being selected by a two
step process: (1) Select one psu, and (2) select one element within the
selected psu. Let

    $P_i$ = the probability of selecting the $i$th psu,
    $P_{j|i}$ = the conditional probability of selecting the $j$th
          element in the $i$th psu given that the $i$th psu has already
          been selected, and
    $P_{ij}$ = the overall probability of selecting the $j$th element in
          the $i$th psu.

Then,

$$P_{ij} = P_i P_{j|i}$$



If the product of the two probabilities, $P_i$ and $P_{j|i}$, is constant for
every element, then every element of the population has an equal chance of
being selected. In other words, given a set of selection probabilities
$P_1, \ldots, P_M$ for the psu's, one could specify that $P_{ij} = \frac{1}{N}$ and compute $P_{j|i}$,
where $P_{j|i} = \frac{1}{N P_i}$, so every element of the population will have an equal
chance of selection.

Exercise 2.11. Refer to Table 2.1. An element is to be selected by
a three-step process as follows: (1) Select one of the Y classes (a row)
with probability $\frac{N_{i\cdot}}{N}$, (2) within the selected row select an X class (a
column) with probability $\frac{N_{ij}}{N_{i\cdot}}$, (3) within the selected cell select an
element with equal probability. Does each element in the population of N
elements have an equal probability of being drawn? What is the probability?

The probability of an element being included in a two-stage sample
is given by

$$P'_{ij} = P'_i P'_{j|i} \qquad (2.4)$$

where

    $P'_i$ = the probability that the $i$th psu is in the sample
          of psu's, and

    $P'_{j|i}$ = the conditional probability which the $j$th element has
          of being in the sample, given that the $i$th psu has
          been selected.

The inclusion probability $P'_{ij}$ will be discussed very briefly for three
important cases:

(1) Suppose a random sample of m psu's is selected with equal
probability and without replacement. The probability, $P'_i$, of the $i$th psu
being in the sample is $f_1 = \frac{m}{M}$ where $f_1$ is the sampling fraction for the
first-stage units. In the second stage of sampling assume that, within
each of the m psu's, a constant proportion, $f_2$, of the elements is selected.
That is, in the $i$th psu in the sample, a simple random sample of $n_i$
elements out of $N_i$ is selected, the condition being that $n_i = f_2 N_i$. Hence,
the conditional probability of the $j$th element in the $i$th psu being in
the sample is $P'_{j|i} = \frac{n_i}{N_i} = f_2$. Substituting in Equation 2.4, we have
$P'_{ij} = f_1 f_2$ which shows that an element's probability of being in the
sample is equal to the product of the sampling fractions at the two stages.
In this case $P'_{ij}$ is constant and is the overall sampling fraction.

Unless $N_i$ is the same for all psu's, the size of the sample,
$n_i = f_2 N_i$, varies from psu to psu. Also, since the psu's are selected
at random the total size of the sample, $n = \sum_{i}^{m} n_i = f_2 \sum_{i}^{m} N_i$, is not constant
with regard to repetition of the sampling plan. In practice variation in
the size, $n_i$, of the sample from psu to psu might be very undesirable. If
appropriate information is available, it is possible to select psu's with
probabilities that will equalize the sample sizes $n_i$ and also keep $P'_{ij}$
constant.
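
Case (1) can be illustrated with a small sketch. The four psu sizes, the
value of m, and the second-stage fraction below are hypothetical; the
function returns the constant inclusion probability $f_1 f_2$ together with
the distinct total sample sizes that can arise over repetitions of the
plan.

```python
from fractions import Fraction
from itertools import combinations


def two_stage_constant_fraction(psu_sizes, m, f2):
    """srs of m psu's, then a constant fraction f2 of elements in each."""
    M = len(psu_sizes)
    f1 = Fraction(m, M)                      # first-stage sampling fraction
    inclusion = f1 * f2                      # P'_ij = f1 * f2 for every element
    totals = sorted(
        {
            sum(f2 * psu_sizes[i] for i in chosen)   # n = f2 * sum of N_i
            for chosen in combinations(range(M), m)
        }
    )
    return inclusion, totals


# Hypothetical psu sizes N_i, with m = 2 and f2 = 1/10.
inclusion, totals = two_stage_constant_fraction(
    [20, 40, 60, 80], m=2, f2=Fraction(1, 10)
)
print("inclusion probability:", inclusion)
print("possible total sample sizes:", [int(t) for t in totals])
```

The inclusion probability is the same for every element, while the total
sample size depends on which psu's happen to be selected.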

(2) Suppose one psu is selected with probability $P_i = \frac{N_i}{N}$. This
is commonly known as sampling with pps (probability proportional to size).
Within the selected psu, assume that a simple random sample of k elements
is selected. (If any $N_i$ are less than k, consolidations could be made so
all psu's have an $N_i$ greater than k.) Then,

$$P'_i = \frac{N_i}{N}, \quad P'_{j|i} = \frac{k}{N_i}, \quad \text{and} \quad P'_{ij} = \left(\frac{N_i}{N}\right)\left(\frac{k}{N_i}\right) = \frac{k}{N}$$

which means that every element of the population has an equal probability,
$\frac{k}{N}$, of being included in a sample of k elements.
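
A short check of case (2) follows. The psu sizes and k are hypothetical,
and the function name is an assumption of this sketch.

```python
from fractions import Fraction


def pps_inclusion_probs(psu_sizes, k):
    """One psu selected with probability N_i / N, then a srs of k elements
    within it. Returns each psu's element inclusion probability."""
    if min(psu_sizes) < k:
        raise ValueError("every psu must contain at least k elements")
    N = sum(psu_sizes)
    return [Fraction(N_i, N) * Fraction(k, N_i) for N_i in psu_sizes]


psu_sizes = [30, 50, 120]        # hypothetical N_i, so N = 200
k = 10
probs = pps_inclusion_probs(psu_sizes, k)
print(probs)
```

Large psu's are more likely to be selected, but an element within a
large psu is less likely to be among the k chosen, and the two effects
cancel.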

Extension of this sampling scheme to a sample of m psu's could
encounter the complications indicated in Section 2.5. However, it was
stated that means exist for circumventing those complications. Sampling
books 1/ discuss this matter quite fully so we will not include it in this
monograph. The point is that one can select m psu's without replacement
in such a way that $m\frac{N_i}{N}$ is the probability of including the $i$th psu in
the sample. That is, $P'_i = m\frac{N_i}{N}$. If a random sample of k elements is
selected with equal probability from each of the selected psu's,

$$P'_{j|i} = \frac{k}{N_i} \quad \text{and} \quad P'_{ij} = \left(m\frac{N_i}{N}\right)\left(\frac{k}{N_i}\right) = \frac{mk}{N} = \frac{n}{N}$$

Thus, if the $N_i$ are known exactly for all M psu's in the population,
and if a list of elements in each psu is available, it is possible to
select a two-stage sample of n elements so that k elements for the sample
come from each of m psu's and every element of the population has an equal
chance of being in the sample. In practice, however, one usually finds
one of two situations: (a) there is no information on the number of
elements in the psu's, or (b) the information that does exist is out-of-date.
Nevertheless, out-of-date information on number of elements in the psu's
can be very useful. It is also possible that a measure of size might
exist which will serve, more efficiently, the purposes of sampling.

(3) Suppose that characteristic Y is used as a measure of size. Let
$Y_i$ be the value of Y for the $i$th psu in the population and let $P_i = \frac{Y_i}{Y}$
where $Y = \sum_{i}^{M} Y_i$. A sample of m psu's is selected in such a way that
$P'_i = m\frac{Y_i}{Y}$ is the probability that the $i$th psu has of being in the sample.
1 Y 



1/ For example, Hansen, Hurwitz, and Madow. Sample Survey Methods and
Theory. Volume I, Chapter 8, John Wiley and Sons. 1953.




With regard to the second stage of sampling, let $f_{2i}$ be the sampling
fraction for selecting a simple random sample within the i-th psu in the
sample. That is, $P'_{j|i} = f_{2i}$. Then,

$$P'_{ij} = (mP_i)(f_{2i}) \tag{2.5}$$

In setting sampling specifications one would decide on a fixed value
for $P'_{ij}$. In this context $P'_{ij}$ is the overall sampling fraction or
proportion of the population that is to be included in the sample. For example,
if one wanted a 5 percent sample, $P'_{ij}$ would be .05. Or, if one knew there
were approximately 50,000 elements in the population and wanted a sample
of about 2,000, he would set $P'_{ij} = .04$. Hence, we will let f be the
overall sampling fraction and set $P'_{ij}$ equal to f. Decisions are also made on
the measure of size to be used and on the number, m, of psu's to be selected.
In Equation 2.5, this leaves $f_{2i}$ to be determined. Thus, $f_{2i}$ is computed
as follows for each psu in the sample:

$$f_{2i} = \frac{fY}{mY_i}$$

Use of the sampling fractions $f_{2i}$ at the second stage of sampling will give
every element of the population a probability equal to f of being in the
sample. A sample wherein every element of the population has an equal
chance of inclusion is often called a self-weighted sample.
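
The computation of the second-stage fractions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five measures of size, the choice m = 2, and the function name `second_stage_fractions` are hypothetical and do not come from the text. The sketch also assumes that every $mY_i/Y$ and every $f_{2i}$ is at most 1, which holds for the values chosen.

```python
from fractions import Fraction

def second_stage_fractions(sizes, m, f):
    """Return f_2i = f*Y / (m*Y_i) for every psu, where Y is the sum of sizes."""
    total = sum(sizes)
    return [Fraction(f) * total / (m * y) for y in sizes]

# Hypothetical measures of size Y_i for five psu's (illustrative values only).
sizes = [400, 250, 150, 120, 80]
m = 2                    # number of psu's to be selected
f = Fraction(1, 20)      # overall sampling fraction (a 5 percent sample)

f2 = second_stage_fractions(sizes, m, f)
total = sum(sizes)

# Overall probability for an element in psu i: (m * Y_i / Y) * f_2i
overall = [Fraction(m * y, total) * f2i for y, f2i in zip(sizes, f2)]
print(f2)
print(overall)
```

Every entry of `overall` equals f, which is the self-weighting property: large psu's are more likely to be selected but are subsampled at a lower rate.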



CHAPTER III. EXPECTED VALUES OF RANDOM VARIABLES

3.1 INTRODUCTION

The theory of expected values of random variables is used extensively
in the theory of sampling; in fact, it is the foundation for
sampling theory. Interpretations of the accuracy of estimates from
probability samples depend heavily on the theory of expected values.

The definition of a random variable was discussed in the previous
chapter. It is a variable that can take (be equal to) any one of a
defined set of values with known probability. Let $X_i$ be the value of X
for the i-th element in a set of N elements and let $P_i$ be the probability
that the i-th element has of being selected by some chance operation so
that $P_i$ is known a priori. What is the expected value of X?

Definition 3.1. The expected value of a random variable X is
$\sum_{i=1}^{N} P_i X_i$ where $\sum_{i=1}^{N} P_i = 1$. The mathematical
notation for the expected value of X is E(X). Hence, by definition,

$$E(X) = \sum_{i=1}^{N} P_i X_i$$

Observe that $\sum P_i X_i$ is a weighted average of the values of X, the
weights being the probabilities of selection. "Expected value" is a
substitute expression for "average value." In other words, E means "the
average value of" or "find the average value of" whatever follows E. For
example, $E(X^2)$, read "the expected value of $X^2$," refers to the average value
of the squares of the values that X can equal. That is, by definition,

$$E(X^2) = \sum_{i=1}^{N} P_i X_i^2$$

If all of the N elements have an equal chance of being selected, all
values of $P_i$ must equal $\frac{1}{N}$ because of the requirement that
$\sum P_i = 1$. In this case,

$$E(X) = \sum_{i=1}^{N} \frac{1}{N} X_i = \frac{\sum X_i}{N} = \bar{X}$$

which is the simple average of X for all N elements.

Illustration 3.1. Assume 12 elements having values of X as follows:

$X_1 = 3,\; X_2 = 9,\; \ldots,\; X_{12} = 4$

In this case $E(X) = \frac{3 + 9 + \cdots + 4}{12} = 5$, assuming each element has the same
chance of selection. Or, by counting the number of times that each
unique value of X occurs, a frequency distribution of X can be obtained
as follows:



    X_j     N_j
     3       5
     4       2
     5       2
     8       1
     9       1
    10       1

where $X_j$ is a unique value of X and $N_j$ is the number of times $X_j$ occurs.
We noted in Chapter I that $\sum_j N_j = N$, $\sum_j N_j X_j = \sum_i X_i$, and that
$\frac{\sum_j N_j X_j}{\sum_j N_j} = \bar{X}$.

Suppose one of the $X_j$ values is selected at random with a probability equal
to $P_j$ where $P_j = \frac{N_j}{\sum_j N_j} = \frac{N_j}{N}$. What is the expected value of $X_j$? By
definition

$$E(X_j) = \sum_j P_j X_j = \sum_j \frac{N_j}{N} X_j = \frac{\sum_j N_j X_j}{N} = \bar{X}$$

The student may verify that in this illustration $E(X_j) = 5$. Note that the
selection specifications were equivalent to selecting one of the 12 elements
at random with equal probability.

Incidentally, a frequency distribution and a probability distribution
are very similar. The probability distribution with reference to $X_j$ would
be:

    X_j     P_j
     3      5/12
     4      2/12
     5      2/12
     8      1/12
     9      1/12
    10      1/12

The 12 values, $P_i = \frac{1}{12}$, for the 12 elements are also a probability
distribution. This illustration shows two ways of treating the set of 12
elements.

When finding expected values be sure that you understand the definition
of the set of values that the random variable might equal and the
probabilities involved.
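
The two ways of treating the 12 elements can be checked with a short Python sketch. The function name `expected_value` is an assumption made for this example, and the 12 values are written in sorted order as implied by the frequency distribution, since the order of the elements does not affect the result.

```python
from fractions import Fraction
from collections import Counter

def expected_value(values, probs):
    """E(X) = sum of P_i * X_i; the probabilities must sum to 1."""
    assert sum(probs) == 1
    return sum(p * x for x, p in zip(values, probs))

# The 12 element values of Illustration 3.1, listed in sorted order.
elements = [3, 3, 3, 3, 3, 4, 4, 5, 5, 8, 9, 10]
N = len(elements)

# Way 1: each of the 12 elements has probability 1/12.
e1 = expected_value(elements, [Fraction(1, N)] * N)

# Way 2: each unique value X_j has probability N_j / N.
freq = Counter(elements)
uniq = sorted(freq)
e2 = expected_value(uniq, [Fraction(freq[x], N) for x in uniq])

print(e1, e2)
```

Both routes give 5, the value found in the illustration.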

Definition 3.2. When X is a random variable, by definition the
expected value of a function of X is

$$E[f(X)] = \sum_{i=1}^{N} P_i [f(X_i)]$$

Some examples of simple functions of X are: $f(X) = aX$, $f(X) = X^2$,
$f(X) = a + bX + cX^2$, and $f(X) = (X-\bar{X})^2$. For each value, $X_i$, in a
defined set there is a corresponding value of $f(X_i)$.




Illustration 3.2. Suppose $f(X) = 2X+3$. With reference to the set
of 12 elements discussed above, there are 12 values of $f(X_i)$ as follows:

    f(X_1)  = (2)(3) + 3 = 9
    f(X_2)  = (2)(9) + 3 = 21
      .
      .
    f(X_12) = (2)(4) + 3 = 11

Assuming $P_i = \frac{1}{12}$ the expected value of $f(X) = 2X+3$ would be

$$E(2X+3) = \sum_{i=1}^{12} \frac{1}{12}(2X_i+3) = \left(\frac{1}{12}\right)(9) + \left(\frac{1}{12}\right)(21) + \cdots + \left(\frac{1}{12}\right)(11) = 13 \tag{3.1}$$

In algebraic terms, for $f(X) = aX+b$, we have

$$E(aX+b) = \sum_{i=1}^{N} P_i(aX_i+b) = \sum P_i(aX_i) + \sum P_i b$$

By definition $\sum P_i(aX_i) = E(aX)$, and $\sum P_i b = E(b)$. Therefore,

$$E(aX+b) = E(aX) + E(b) \tag{3.2}$$

Since b is constant and $\sum P_i = 1$, $\sum P_i b = b$, which leads to the first
important theorem in expected values.

Theorem 3.1. The expected value of a constant is equal to the
constant: $E(a) = a$.

By definition $E(aX) = \sum P_i(aX_i) = a\sum P_i X_i$. Since $\sum P_i X_i = E(X)$, we have
another important theorem:

Theorem 3.2. The expected value of a constant times a variable equals
the constant times the expected value of the variable: $E(aX) = aE(X)$.

Applying these two theorems to Equation (3.2) we have $E(aX+b) =
aE(X) + b$. Therefore, with reference to Illustration 3.2, $E(2X+3) =
2E(X) + 3 = 2(5) + 3 = 13$, which is the same as the result found in
Equation (3.1).






Exercise 3.1. Suppose a random variable X can take any of the
following four values with the probabilities indicated:

    X_1 = 2      X_2 = 5      X_3 = 4      X_4 = 6
    P_1 = 2/6    P_2 = 2/6    P_3 = 1/6    P_4 = 1/6

(a) Find $E(X)$. Answer: 4

(b) Find $E(X^2)$. Answer: $18\frac{1}{3}$. Note that $E(X^2) \neq [E(X)]^2$.

(c) Find $E(X-\bar{X})$. Answer: 0. Note: By definition
    $E(X-\bar{X}) = \sum_{i=1}^{4} P_i(X_i-\bar{X})$.

(d) Find $E(X-\bar{X})^2$. Answer: $2\frac{1}{3}$. Note: By definition
    $E(X-\bar{X})^2 = \sum_{i=1}^{4} P_i(X_i-\bar{X})^2$.

Exercise 3.2. From the following set of three values of Y one
value is to be selected with a probability $P'_j$:

    Y_1 = -2      Y_2 = 2       Y_3 = 4
    P'_1 = 1/4    P'_2 = 2/4    P'_3 = 1/4

(a) Find $E(Y)$. Answer: $1\frac{1}{2}$

(b) Find $E(\frac{1}{Y})$. Answer: 3/16. Note: $\frac{1}{E(Y)} \neq E(\frac{1}{Y})$.

(c) Find $E(Y-\bar{Y})^2$. Answer: $4\frac{3}{4}$

3.2 EXPECTED VALUE OF THE SUM OF TWO RANDOM VARIABLES

The sum of two or more random variables is also a random variable.
If X and Y are two random variables, the expected value of X + Y is equal
to the expected value of X plus the expected value of Y: $E(X+Y) = E(X)+E(Y)$.
Two numerical illustrations will help clarify the situation.

Illustration 3.3. Consider the two random variables X and Y in
Exercises 3.1 and 3.2:

    X_1 = 2    P_1 = 2/6        Y_1 = -2    P'_1 = 1/4
    X_2 = 5    P_2 = 2/6        Y_2 = 2     P'_2 = 2/4
    X_3 = 4    P_3 = 1/6        Y_3 = 4     P'_3 = 1/4
    X_4 = 6    P_4 = 1/6

Suppose one element of the first set and one element of the second
set are selected with probabilities as listed above. What is the expected
value of X + Y? The joint probability of getting $X_i$ and $Y_j$ is $P_i P'_j$ because
the two selections are independent. Hence by definition

$$E(X + Y) = \sum_{i=1}^{4} \sum_{j=1}^{3} P_i P'_j (X_i + Y_j) \tag{3.3}$$

The possible values of X + Y and the probability of each are as follows:

        X + Y               P_i P'_j
    X_1 + Y_1 = 0       P_1 P'_1 = 2/24
    X_1 + Y_2 = 4       P_1 P'_2 = 4/24
    X_1 + Y_3 = 6       P_1 P'_3 = 2/24
    X_2 + Y_1 = 3       P_2 P'_1 = 2/24
    X_2 + Y_2 = 7       P_2 P'_2 = 4/24
    X_2 + Y_3 = 9       P_2 P'_3 = 2/24
    X_3 + Y_1 = 2       P_3 P'_1 = 1/24
    X_3 + Y_2 = 6       P_3 P'_2 = 2/24
    X_3 + Y_3 = 8       P_3 P'_3 = 1/24
    X_4 + Y_1 = 4       P_4 P'_1 = 1/24
    X_4 + Y_2 = 8       P_4 P'_2 = 2/24
    X_4 + Y_3 = 10      P_4 P'_3 = 1/24

As a check the sum of the probabilities must be 1 if all possible
sums have been listed and the probability of each has been correctly
determined. Substituting the values of $X_i + Y_j$ and $P_i P'_j$ in Equation (3.3)
we obtain 5.5 as follows for expected value of X + Y:

$$\left(\frac{2}{24}\right)(0) + \left(\frac{4}{24}\right)(4) + \cdots + \left(\frac{1}{24}\right)(10) = 5.5$$






From Exercises 3.1 and 3.2 we have $E(X) = 4$ and $E(Y) = 1.5$. Therefore,
$E(X) + E(Y) = 4 + 1.5 = 5.5$ which verifies the earlier statement
that $E(X + Y) = E(X) + E(Y)$.
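
The listing in Illustration 3.3 can be reproduced with a brief Python sketch that builds every pair $(X_i, Y_j)$ and its joint probability. The variable names are assumptions made for this example; the values and probabilities are those of Exercises 3.1 and 3.2.

```python
from fractions import Fraction as F
from itertools import product

X = [2, 5, 4, 6]
PX = [F(2, 6), F(2, 6), F(1, 6), F(1, 6)]
Y = [-2, 2, 4]
PY = [F(1, 4), F(2, 4), F(1, 4)]

# Joint probability of (X_i, Y_j) under independent selection is P_i * P'_j.
joint = [(x + y, p * q) for (x, p), (y, q) in product(zip(X, PX), zip(Y, PY))]

# The check suggested in the text: the 12 joint probabilities sum to 1.
assert sum(prob for _, prob in joint) == 1

e_sum = sum(s * prob for s, prob in joint)
e_x = sum(x * p for x, p in zip(X, PX))
e_y = sum(y * q for y, q in zip(Y, PY))
print(e_sum, e_x + e_y)
```

Both printed values are 11/2, that is, 5.5.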

Illustration 3.4. Suppose a random sample of two is selected with
replacement from the population of four elements used in Exercise 3.1.
Let $x_1$ be the first value selected and let $x_2$ be the second. Then $x_1$ and
$x_2$ are random variables and $x_1 + x_2$ is a random variable. The possible
values of $x_1 + x_2$ and the probability of each, $P(x_1,x_2)$, are listed below.
Notice that each possible order of selection is treated separately.

    x_1     x_2     P(x_1,x_2)     x_1 + x_2
    X_1     X_1        4/36             4
    X_1     X_2        4/36             7
    X_1     X_3        2/36             6
    X_1     X_4        2/36             8
    X_2     X_1        4/36             7
    X_2     X_2        4/36            10
    X_2     X_3        2/36             9
    X_2     X_4        2/36            11
    X_3     X_1        2/36             6
    X_3     X_2        2/36             9
    X_3     X_3        1/36             8
    X_3     X_4        1/36            10
    X_4     X_1        2/36             8
    X_4     X_2        2/36            11
    X_4     X_3        1/36            10
    X_4     X_4        1/36            12

By definition $E(x_1 + x_2)$ is

$$\frac{4}{36}(4) + \frac{4}{36}(7) + \frac{2}{36}(6) + \cdots + \frac{1}{36}(12) = 8$$

In Exercise 3.1 we found $E(X) = 4$. Since $x_1$ is the same random variable
as X, $E(x_1) = 4$. Also, $x_2$ is the same random variable as X, and $E(x_2) = 4$.
Therefore, $E(x_1) + E(x_2) = 8$, which verifies that $E(x_1+x_2) = E(x_1) + E(x_2)$.

In general if X and Y are two random variables, where X might equal
$X_1,\ldots,X_N$ and Y might equal $Y_1,\ldots,Y_M$, then $E(X + Y) = E(X)+E(Y)$. The
proof is as follows: By definition $E(X+Y) = \sum_i^N \sum_j^M P_{ij}(X_i+Y_j)$ where $P_{ij}$ is
the probability of getting the sum $X_i + Y_j$, and $\sum\sum P_{ij} = 1$. The double
summation is over all possible values of $P_{ij}(X_i+Y_j)$. According to
the rules for summation we may write

$$\sum_i^N \sum_j^M P_{ij}(X_i+Y_j) = \sum_i^N \sum_j^M P_{ij}X_i + \sum_i^N \sum_j^M P_{ij}Y_j \tag{3.4}$$

In the first term on the right, $X_i$ is constant with regard to the summation
over j; and in the second term on the right, $Y_j$ is constant with regard
to the summation over i. Therefore, the right-hand side of Equation (3.4)
can be written as

$$\sum_i^N X_i \sum_j^M P_{ij} + \sum_j^M Y_j \sum_i^N P_{ij}$$

And, since $\sum_j^M P_{ij} = P_i$ and $\sum_i^N P_{ij} = P_j$, Equation (3.4) becomes

$$\sum_i^N \sum_j^M P_{ij}(X_i+Y_j) = \sum_i^N X_i P_i + \sum_j^M Y_j P_j$$

By definition $\sum_i^N X_i P_i = E(X)$ and $\sum_j^M Y_j P_j = E(Y)$.

Therefore $E(X+Y) = E(X) + E(Y)$.

If the proof is not clear write the values of $P_{ij}(X_i+Y_j)$ in a matrix
format. Then, follow the summation manipulations in the proof.

The above result extends to any number of random variables; that is,
the expected value of a sum of random variables is the sum of the expected
values of each. In fact, there is a very important theorem that applies
to a linear combination of random variables.






Theorem 3.3. Let $u = a_1u_1 + \cdots + a_ku_k$, where $u_1,\ldots,u_k$ are random
variables and $a_1,\ldots,a_k$ are constants. Then

$$E(u) = a_1E(u_1) + \cdots + a_kE(u_k)$$

or in summation notation

$$E(u) = E\sum_i^k a_iu_i = \sum_i^k a_iE(u_i)$$

The generality of Theorem 3.3 is impressive. For example, with reference
to sampling from a population $X_1,\ldots,X_N$, $u_1$ might be the value of X
obtained at the first draw, $u_2$ the value obtained at the second draw, etc.
The constants could be weights. Thus, in this case, u would be a weighted
average of the sample measurements. Or, suppose $\bar{x}_1,\ldots,\bar{x}_k$ are averages
from a random sample for k different age groups. The averages are random
variables and the theorem could be applied to any linear combination of the
averages. In fact $u_i$ could be any function of random variables. That is,
the only condition on which the theorem is based is that $u_i$ must be a
random variable.

Illustration 3.5. Suppose we want to find the expected value of
$(X + Y)^2$ where X and Y are random variables. Before Theorem 3.3 can be
applied we must square (X + Y). Thus $E(X + Y)^2 = E(X^2 + 2XY + Y^2)$.
The application of Theorem 3.3 gives $E(X + Y)^2 = E(X^2) + 2E(XY) + E(Y^2)$.

Illustration 3.6. We will now show that

$$E(X-\bar{X})(Y-\bar{Y}) = E(XY) - \bar{X}\bar{Y} \quad \text{where } E(X) = \bar{X} \text{ and } E(Y) = \bar{Y}$$

Since $(X-\bar{X})(Y-\bar{Y}) = XY - \bar{X}Y - X\bar{Y} + \bar{X}\bar{Y}$ we have

$$E(X-\bar{X})(Y-\bar{Y}) = E(XY - \bar{X}Y - X\bar{Y} + \bar{X}\bar{Y})$$

and application of Theorem 3.3 gives

$$E(X-\bar{X})(Y-\bar{Y}) = E(XY) - E(\bar{X}Y) - E(\bar{Y}X) + E(\bar{X}\bar{Y})$$

Since $\bar{X}$ and $\bar{Y}$ are constant, $E(\bar{X}Y) = \bar{X}E(Y) = \bar{X}\bar{Y}$, $E(\bar{Y}X) = \bar{Y}\bar{X}$, and $E(\bar{X}\bar{Y}) = \bar{X}\bar{Y}$.
Therefore, $E(X-\bar{X})(Y-\bar{Y}) = E(XY) - \bar{X}\bar{Y}$.

Exercise 3.3. Suppose $E(X) = 6$ and $E(Y) = 4$. Find

(a) $E(2X+4Y)$          Answer: 28

(b) $[E(2X)]^2$         Answer: 144

(c) $\sqrt{E(Y)}$       Answer: 2

(d) $E(5Y-X)$           Answer: 14

Exercise 3.4. Prove the following, assuming $E(X) = \bar{X}$ and $E(Y) = \bar{Y}$:

(a) $E(X-\bar{X}) = 0$

(b) $E(aX-bY) + cE(Y) = a\bar{X} + (c-b)\bar{Y}$

(c) $E[a(X-\bar{X}) + b(Y-\bar{Y})] = 0$

(d) $E(X+a)^2 = E(X^2) + 2a\bar{X} + a^2$

(e) $E(X-\bar{X})^2 = E(X^2) - \bar{X}^2$

(f) $E(aX+bY) = 0$ for any values of a and b if $E(X) = 0$ and $E(Y) = 0$.
3.3 EXPECTED VALUE OF AN ESTIMATE

Theorem 3.3 will now be used to find the expected value of the mean
of a simple random sample of n elements selected without replacement from
a population of N elements. The term "simple random sample" implies equal
probability of selection without replacement. The sample average is

$$\bar{x} = \frac{x_1 + \cdots + x_n}{n}$$

where $x_i$ is the value of X for the i-th element in the sample. Without
loss of generality, we can consider the subscript of x as corresponding
to the i-th draw; i.e., $x_1$ is the value of X obtained on the first draw,
$x_2$ the value on the second, etc. As each $x_i$ is a random variable, $\bar{x}$
is a linear combination of random variables. Therefore, Theorem 3.3
applies and

$$E(\bar{x}) = \frac{1}{n}[E(x_1) + \cdots + E(x_n)]$$

In the previous chapter, Section 2.6, we found that any given element of
the population had a chance of $\frac{1}{N}$ of being selected on the i-th draw.
This means that $x_i$ is a random variable that has a probability equal to $\frac{1}{N}$
of being equal to any value of the population set $X_1,\ldots,X_N$. Therefore,

$$E(x_1) = E(x_2) = \cdots = E(x_n) = \bar{X}$$

Hence, $E(\bar{x}) = \frac{\bar{X} + \cdots + \bar{X}}{n} = \bar{X}$. The fact that $E(\bar{x}) = \bar{X}$ is one of the very
important properties of an average from a simple random sample. Incidentally,
$E(\bar{x}) = \bar{X}$ whether the sampling is with or without replacement.
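
The property $E(\bar{x}) = \bar{X}$ can be confirmed by brute force for a small population. The sketch below is an illustration only: the population of five values and the sample size n = 3 are hypothetical. It relies on the fact that under simple random sampling every unordered sample of n distinct elements is equally likely.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    return Fraction(sum(values), len(values))

population = [1, 3, 8, 10, 13]   # hypothetical values, N = 5
n = 3

# Every unordered sample of n distinct elements is equally likely, so
# E(x-bar) is the plain average of the sample means over all C(N, n) samples.
samples = list(combinations(population, n))
expected_mean = sum(mean(s) for s in samples) / len(samples)

print(len(samples), expected_mean, mean(population))
```

The average of the 10 sample means equals the population mean, 7.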

Definition 3.3. A parameter is a quantity computed from all values
in a population set. The total of X, the average of X, the proportion of
elements for which $X_i < A$, or any other quantity computed from measurements
including all elements of the population is a parameter. The numerical
value of a parameter is usually unknown but it exists by definition.

Definition 3.4. An estimator is a mathematical formula or rule for
making an estimate from a sample. The formula for a sample average,
$\bar{x} = \frac{\sum x_i}{n}$, is a simple example of an estimator. It provides an estimate of
the parameter $\bar{X} = \frac{\sum X_i}{N}$.

Definition 3.5. An estimate is unbiased when its expected value
equals the parameter that it is an estimate of. In the above example, $\bar{x}$
is an unbiased estimate of $\bar{X}$ because $E(\bar{x}) = \bar{X}$.

Exercise 3.5. Assume a population of only four elements having values
of X as follows: $X_1 = 2$, $X_2 = 5$, $X_3 = 4$, $X_4 = 6$. For simple random samples
of size 2 show that the estimator $N\bar{x}$ provides an unbiased estimate of the
population total, $\sum X_i = 17$. List all six possible samples of two and
calculate $N\bar{x}$ for each. This will give the set of values that the random
variable $N\bar{x}$ can be equal to. Consider the probability of each of the
possible values of $N\bar{x}$ and show arithmetically that $E(N\bar{x}) = 17$.

A sample of elements from a population is not always selected by
using equal probabilities of selection. Sampling with unequal probability
is complicated when the sampling is without replacement, so we will limit
our discussion to sampling with replacement.

Illustration 3.7. The set of four elements and the associated
probabilities used in Exercise 3.1 will serve as an example of unbiased
estimation when samples of two elements are selected with unequal
probability and with replacement. Our estimator of the population total,
$2+5+4+6 = 17$, will be

$$x' = \frac{\sum_{i=1}^{n} \frac{x_i}{p_i}}{n}$$

The estimate x' is a random variable. Listed below are the set of values
that x' can equal and the probability of each value occurring.

    Possible samples     x'_j      P_j
      X_1, X_1            6        4/36
      X_1, X_2           10.5      8/36
      X_1, X_3           15        4/36
      X_1, X_4           21        4/36
      X_2, X_2           15        4/36
      X_2, X_3           19.5      4/36
      X_2, X_4           25.5      4/36
      X_3, X_3           24        1/36
      X_3, X_4           30        2/36
      X_4, X_4           36        1/36





Exercise 3.6. Verify the above values of $x'_j$ and $P_j$ and find the
expected value of x'. By definition $E(x') = \sum P_j x'_j$. Your answer should
be 17 because x' is an unbiased estimate of the population total.

To put sampling with replacement and unequal probabilities in a
general setting, assume the population is $X_1,\ldots,X_j,\ldots,X_N$ and the
selection probabilities are $P_1,\ldots,P_j,\ldots,P_N$. Let $x_i$ be the value of X for
the i-th element in a sample of n elements and let $p_i$ be the probability
which that element had of being selected. Then

$$x' = \frac{\sum_{i=1}^{n} \frac{x_i}{p_i}}{n}$$

is an unbiased estimate of the population total. We will now show that
$E(x') = \sum_{j=1}^{N} X_j$.

To facilitate comparison of x' with u in Theorem 3.3, x' may be
written as follows:

$$x' = \frac{1}{n}\left(\frac{x_1}{p_1}\right) + \cdots + \frac{1}{n}\left(\frac{x_n}{p_n}\right)$$

It is now clear that $a_i = \frac{1}{n}$ and $u_i = \frac{x_i}{p_i}$. Therefore,

$$E(x') = \frac{1}{n}\left[E\left(\frac{x_1}{p_1}\right) + \cdots + E\left(\frac{x_n}{p_n}\right)\right] \tag{3.5}$$

The quantity $\frac{x_1}{p_1}$, which is the outcome of the first random selection from
the population, is a random variable that might be equal to any one of the
set of values $\frac{X_1}{P_1},\ldots,\frac{X_j}{P_j},\ldots,\frac{X_N}{P_N}$. The probability that $\frac{x_1}{p_1}$ equals $\frac{X_j}{P_j}$ is $P_j$.
Therefore, by definition

$$E\left(\frac{x_1}{p_1}\right) = \sum_{j=1}^{N} P_j\left(\frac{X_j}{P_j}\right) = \sum_{j=1}^{N} X_j$$

Since the sampling is with replacement it is clear that any $\frac{x_i}{p_i}$ is the same
random variable as $\frac{x_1}{p_1}$.

Therefore Equation (3.5) becomes

$$E(x') = \frac{1}{n}\left[\sum_{j=1}^{N} X_j + \cdots + \sum_{j=1}^{N} X_j\right]$$

Since there are n terms in the series it follows that

$$E(x') = \sum_{j=1}^{N} X_j$$

Exercise 3.7. As a corollary show that the expected value of $\frac{x'}{N}$ is
equal to the population mean.
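
The unbiasedness of x' under sampling with replacement and unequal probabilities can also be checked by enumerating every ordered sample. The following is a minimal sketch; the function name `expected_estimate` is an assumption made for this example, while the values and probabilities are those of Exercise 3.1.

```python
from fractions import Fraction as F
from itertools import product

X = [2, 5, 4, 6]
P = [F(2, 6), F(2, 6), F(1, 6), F(1, 6)]

def expected_estimate(X, P, n):
    """E(x') found by enumerating every ordered sample of n draws."""
    total = F(0)
    for draws in product(range(len(X)), repeat=n):
        prob = F(1)
        estimate = F(0)
        for j in draws:
            prob *= P[j]               # draws are independent
            estimate += X[j] / P[j]    # x_i / p_i for this draw
        total += prob * estimate / n
    return total

for n in (1, 2, 3):
    print(n, expected_estimate(X, P, n))
```

For each sample size the expected value is 17, the population total, which agrees with the proof above: the result does not depend on n.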

By this time, you should be getting familiar with the idea that an
estimate from a probability sample is a random variable. Persons
responsible for the design and selection of samples and for making estimates
from samples are concerned about the set of values, and associated
probabilities, that an estimate from a sample might be equal to.

Definition 3.6. The distribution of an estimate generated by
probability sampling is the sampling distribution of the estimate.

The values of $x'_j$ and $P_j$ in the numerical Illustration 3.7 are an
example of a sampling distribution. Statisticians are primarily
interested in three characteristics of a sampling distribution: (1) the mean
(center) of the sampling distribution in relation to the value of the
parameter being estimated, (2) a measure of the variation of possible
values of an estimate from the mean of the sampling distribution, and
(3) the shape of the sampling distribution. We have been discussing the
first. When the expected value of an estimate equals the parameter being
estimated, we know that the mean of the sampling distribution is equal to
the parameter estimated. But, in practice, values of parameters are
generally not known. To judge the accuracy of an estimate, we need
information on all three characteristics of the sampling distribution.
Let us turn now to the generally accepted measure of variation of a random
variable.

3.4 VARIANCE OF A RANDOM VARIABLE

The variance of a random variable, X, is the average value of the squares
of the deviation of X from its mean; that is, the average value of $(X-\bar{X})^2$.
The square root of the variance is the standard deviation (error) of the
variable.

Definition 3.7. In terms of expected values, the variance of a random
variable, X, is $E(X-\bar{X})^2$ where $E(X) = \bar{X}$. Since X is a random variable,
$(X-\bar{X})^2$ is a random variable and by definition of expected value,

$$E(X-\bar{X})^2 = \sum_i^N P_i(X_i-\bar{X})^2$$

In case $P_i = \frac{1}{N}$ we have the more familiar formula for variance, namely,

$$E(X-\bar{X})^2 = \frac{\sum_i^N (X_i-\bar{X})^2}{N} = \sigma_X^2$$

Commonly used symbols for variance include: $\sigma^2$, $\sigma_X^2$, $V^2$, $S^2$, Var(X)
and V(X). Variance is often defined as $\frac{\sum (X_i-\bar{X})^2}{N-1}$. This will be discussed
in Section 3.7.

3.4.1 VARIANCE OF THE SUM OF TWO INDEPENDENT RANDOM VARIABLES

Two random variables, X and Y, are independent if the joint probability,
$P_{ij}$, of getting $X_i$ and $Y_j$ is equal to $(P_i)(P_j)$, where $P_i$ is the probability
of selecting $X_i$ from the set of values of X, and $P_j$ is the probability of
selecting $Y_j$ from the set of values of Y. The variance of the sum of two
independent random variables is the sum of the variance of each. That is,

$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$$

Illustration 3.8. In Illustration 3.3, X and Y were independent. We
had listed all possible values of $X_i + Y_j$ and the probability of each. From
that listing we can readily compute the variance of X+Y. By definition

$$\sigma_{X+Y}^2 = E[(X+Y)-(\bar{X}+\bar{Y})]^2 = \sum_i \sum_j P_iP'_j[(X_i+Y_j)-(\bar{X}+\bar{Y})]^2 \tag{3.6}$$

Substituting in Equation (3.6) we have

$$\sigma_{X+Y}^2 = \frac{2}{24}(0-5.5)^2 + \frac{4}{24}(4-5.5)^2 + \cdots + \frac{1}{24}(10-5.5)^2 = \frac{85}{12}$$

The variances of X and Y are computed as follows:

$$\sigma_X^2 = E(X-\bar{X})^2 = \frac{2}{6}(2-4)^2 + \frac{2}{6}(5-4)^2 + \frac{1}{6}(4-4)^2 + \frac{1}{6}(6-4)^2 = \frac{7}{3}$$

$$\sigma_Y^2 = E(Y-\bar{Y})^2 = \frac{1}{4}(-2-1.5)^2 + \frac{2}{4}(2-1.5)^2 + \frac{1}{4}(4-1.5)^2 = \frac{19}{4}$$

We now have $\sigma_X^2 + \sigma_Y^2 = \frac{7}{3} + \frac{19}{4} = \frac{85}{12}$ which verifies the above statement that
the variance of the sum of two independent random variables is the sum of
the variances.
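
The arithmetic of Illustration 3.8 can be reproduced with exact fractions. In the sketch below the helper names `mean` and `variance` are assumptions made for this example; the distribution of X + Y is built from the products $P_iP'_j$ as in Illustration 3.3.

```python
from fractions import Fraction as F

def mean(vals, probs):
    return sum(p * v for v, p in zip(vals, probs))

def variance(vals, probs):
    """E(V - mean)^2 for a discrete distribution."""
    mu = mean(vals, probs)
    return sum(p * (v - mu) ** 2 for v, p in zip(vals, probs))

X = [2, 5, 4, 6]
PX = [F(2, 6), F(2, 6), F(1, 6), F(1, 6)]
Y = [-2, 2, 4]
PY = [F(1, 4), F(2, 4), F(1, 4)]

# Distribution of X + Y when the two selections are independent.
sums = [x + y for x in X for y in Y]
probs = [p * q for p in PX for q in PY]

print(variance(X, PX), variance(Y, PY), variance(sums, probs))
```

The three printed values are 7/3, 19/4, and 85/12, and the third is the sum of the first two.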

Exercise 3.8. Prove that $E[(X+Y)-(\bar{X}+\bar{Y})]^2 = E(X+Y)^2 - (\bar{X}+\bar{Y})^2$. Then
calculate the variance of X+Y in Illustration 3.3 by using the formula
$\sigma_{X+Y}^2 = E(X+Y)^2 - (\bar{X}+\bar{Y})^2$. The answer should agree with the result obtained
in Illustration 3.8.

Exercise 3.9. Refer to Illustration 3.3 and the listing of possible
values of X + Y and the probability of each. Instead of $X_i+Y_j$ list the
products $(X_i-\bar{X})(Y_j-\bar{Y})$ and show that $E(X_i-\bar{X})(Y_j-\bar{Y}) = 0$.

Exercise 3.10. Find $E(X-\bar{X})(Y-\bar{Y})$ for the numerical example used in
Illustration 3.3 by the formula $E(XY) - \bar{X}\bar{Y}$ which was derived in
Illustration 3.6.




3.4.2 VARIANCE OF THE SUM OF TWO DEPENDENT RANDOM VARIABLES

The variance of dependent random variables involves covariance which
is defined as follows:

Definition 3.8. The covariance of two random variables, X and Y, is
$E(X-\bar{X})(Y-\bar{Y})$ where $E(X) = \bar{X}$ and $E(Y) = \bar{Y}$. By definition of expected value

$$E(X-\bar{X})(Y-\bar{Y}) = \sum_i \sum_j P_{ij}(X_i-\bar{X})(Y_j-\bar{Y})$$

where the summation is over all possible values of X and Y.

Symbols commonly used for covariance are $\sigma_{XY}$, $S_{XY}$, and Cov(X,Y).

Since $(X+Y) - (\bar{X}+\bar{Y}) = (X-\bar{X}) + (Y-\bar{Y})$ we can derive a formula for the
variance of X+Y as follows:

$$\sigma_{X+Y}^2 = E[(X+Y) - (\bar{X}+\bar{Y})]^2$$
$$= E[(X-\bar{X}) + (Y-\bar{Y})]^2$$
$$= E[(X-\bar{X})^2 + (Y-\bar{Y})^2 + 2(X-\bar{X})(Y-\bar{Y})]$$

Then, according to Theorem 3.3,

$$\sigma_{X+Y}^2 = E(X-\bar{X})^2 + E(Y-\bar{Y})^2 + 2E(X-\bar{X})(Y-\bar{Y})$$

and by definition we obtain

$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}$$

Sometimes $\sigma_{XX}$ is used instead of $\sigma_X^2$ to represent variance. Thus

$$\sigma_{X+Y}^2 = \sigma_{XX} + \sigma_{YY} + 2\sigma_{XY}$$

For two independent random variables, $P_{ij} = P_iP_j$. Therefore

$$E(X-\bar{X})(Y-\bar{Y}) = \sum_i \sum_j P_iP_j(X_i-\bar{X})(Y_j-\bar{Y})$$

Write out in longhand, if necessary, and be satisfied that the following
is correct:

$$\sum_i \sum_j P_iP_j(X_i-\bar{X})(Y_j-\bar{Y}) = \sum_i P_i(X_i-\bar{X}) \sum_j P_j(Y_j-\bar{Y}) = 0 \tag{3.7}$$

which proves that the covariance is zero when X and Y are independent.
Notice that in Equation (3.7) $\sum_i P_i(X_i-\bar{X}) = E(X-\bar{X})$ and $\sum_j P_j(Y_j-\bar{Y}) = E(Y-\bar{Y})$
which, for independent random variables, proves that $E(X-\bar{X})(Y-\bar{Y}) =
E(X-\bar{X})E(Y-\bar{Y})$. When working with independent random variables the following
important theorem is frequently very useful:

Theorem 3.4. The expected value of the product of independent random
variables $u_1, u_2, \ldots, u_k$ is the product of their expected values:

$$E(u_1u_2 \cdots u_k) = E(u_1)E(u_2) \cdots E(u_k)$$
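
The formula $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}$ can be checked numerically for a pair of dependent variables. The joint distribution in the sketch below is hypothetical and was chosen only so that $P_{ij}$ differs from the product of the marginal probabilities; the helper `E` is likewise an assumption made for this example.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P_ij over pairs (X_i, Y_j); X and Y are
# dependent here because P_ij is not the product of the marginals.
joint = {(0, 0): F(3, 8), (0, 1): F(1, 8),
         (1, 0): F(1, 8), (1, 1): F(3, 8)}

def E(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mx, my = E(lambda x, y: x), E(lambda x, y: y)
var_x = E(lambda x, y: (x - mx) ** 2)
var_y = E(lambda x, y: (y - my) ** 2)
cov = E(lambda x, y: (x - mx) * (y - my))
var_sum = E(lambda x, y: (x + y - mx - my) ** 2)

print(cov, E(lambda x, y: x * y) - mx * my)   # two routes to the covariance
print(var_sum, var_x + var_y + 2 * cov)
```

Each line prints the same value twice: the covariance is 1/8 by either route (Definition 3.8 or the formula of Illustration 3.6), and the variance of the sum is 3/4 whether computed directly or from the variances and covariance.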
3.5 VARIANCE OF AN ESTIMATE

The variance of an estimate from a probability sample depends upon
the method of sampling. We will derive the formula for the variance of $\bar{x}$,
the mean of a random sample selected with equal probability, with and
without replacement. Then, the variance of an estimate of the population
total will be derived for sampling with replacement and unequal probability
of selection.

3.5.1 EQUAL PROBABILITY OF SELECTION

The variance of the mean of a random sample of n elements selected
with equal probabilities and with replacement from a population of N, is:

$$Var(\bar{x}) = \frac{\sigma_X^2}{n}, \quad \text{where } \sigma_X^2 = \frac{\sum_i^N (X_i-\bar{X})^2}{N}$$

The proof follows:

By definition, $Var(\bar{x}) = E[\bar{x}-E(\bar{x})]^2$. We have shown that $E(\bar{x}) = \bar{X}$. Therefore,
$Var(\bar{x}) = E(\bar{x}-\bar{X})^2$. By substitution and algebraic manipulation, we obtain

$$Var(\bar{x}) = E\left[\frac{x_1 + \cdots + x_n}{n} - \bar{X}\right]^2$$
$$= E\left[\frac{(x_1-\bar{X}) + \cdots + (x_n-\bar{X})}{n}\right]^2$$
$$= \frac{1}{n^2} E\left[\sum_{i=1}^{n} (x_i-\bar{X})^2 + \sum_{i \neq j} \sum (x_i-\bar{X})(x_j-\bar{X})\right]$$

Applying Theorem 3.3 we now obtain

$$Var(\bar{x}) = \frac{1}{n^2}\left[\sum_{i=1}^{n} E(x_i-\bar{X})^2 + \sum_{i \neq j} \sum E(x_i-\bar{X})(x_j-\bar{X})\right] \tag{3.8}$$

In series form, Equation (3.8) can be written as

$$Var(\bar{x}) = \frac{1}{n^2}[E(x_1-\bar{X})^2 + E(x_2-\bar{X})^2 + \cdots + E(x_1-\bar{X})(x_2-\bar{X}) + E(x_1-\bar{X})(x_3-\bar{X}) + \cdots]$$

Since the sampling is with replacement $x_i$ and $x_j$ are independent and
the expected value of all of the product terms is zero. For example,
$E(x_1-\bar{X})(x_2-\bar{X}) = E(x_1-\bar{X})E(x_2-\bar{X})$ and we know that $E(x_1-\bar{X})$ and $E(x_2-\bar{X})$ are
zero. Next, consider $E(x_1-\bar{X})^2$. We have already shown that $x_1$ is a
random variable that can be equal to any one of the population set of
values $X_1,\ldots,X_N$ with equal probability. Therefore

$$E(x_1-\bar{X})^2 = \frac{\sum_i^N (X_i-\bar{X})^2}{N} = \sigma_X^2$$

The same argument applies to $x_2$, $x_3$, etc. Therefore,
$\sum_{i=1}^{n} E(x_i-\bar{X})^2 = \sigma_X^2 + \cdots + \sigma_X^2 = n\sigma_X^2$ and Equation (3.8) reduces to $Var(\bar{x}) = \frac{\sigma_X^2}{n}$.

The mathematics for finding the variance of $\bar{x}$ when the sampling is
without replacement is the same as sampling with replacement down to and
including Equation (3.8). The expected value of a product term in Equation
(3.8) is not zero because $x_i$ and $x_j$ are not independent. For example, on
the first draw an element has a probability of $\frac{1}{N}$ of being selected, but
on the second draw the probability is conditioned by the fact that the
element selected on the first draw was not replaced. Consider the first
product term in Equation (3.8). To find $E(x_1-\bar{X})(x_2-\bar{X})$ we need to consider
the set of values that $(x_1-\bar{X})(x_2-\bar{X})$ could be equal to. Reference to the
following matrix is helpful:

$$\begin{matrix}
(X_1-\bar{X})^2 & (X_1-\bar{X})(X_2-\bar{X}) & \cdots & (X_1-\bar{X})(X_N-\bar{X}) \\
(X_2-\bar{X})(X_1-\bar{X}) & (X_2-\bar{X})^2 & \cdots & (X_2-\bar{X})(X_N-\bar{X}) \\
\vdots & \vdots & & \vdots \\
(X_N-\bar{X})(X_1-\bar{X}) & (X_N-\bar{X})(X_2-\bar{X}) & \cdots & (X_N-\bar{X})^2
\end{matrix}$$

The random variable $(x_1-\bar{X})(x_2-\bar{X})$ has an equal probability of being any of
the products in the above matrix, except for the squared terms on the main
diagonal. There are N(N-1) such products. Therefore,

$$E(x_1-\bar{X})(x_2-\bar{X}) = \frac{\sum_{i \neq j} \sum (X_i-\bar{X})(X_j-\bar{X})}{N(N-1)}$$

According to Equation (1.9) in Chapter 1,

$$\sum_{i \neq j} \sum (X_i-\bar{X})(X_j-\bar{X}) = -\sum_i^N (X_i-\bar{X})^2$$

Hence,

$$E(x_1-\bar{X})(x_2-\bar{X}) = -\frac{\sum_i^N (X_i-\bar{X})^2}{N(N-1)} = -\frac{\sigma_X^2}{N-1}$$

The same evaluation applies to all other product terms in Equation (3.8).
There are n(n-1) product terms in Equation (3.8) and the expected value of
each is $-\frac{\sigma_X^2}{N-1}$. Thus, Equation (3.8) becomes

$$Var(\bar{x}) = \frac{1}{n^2}\left[\sum_i^n E(x_i-\bar{X})^2 - n(n-1)\frac{\sigma_X^2}{N-1}\right]$$

Recognizing that $E(x_i-\bar{X})^2 = \sigma_X^2$ and after some easy algebraic operations
the answer as follows is obtained:

$$Var(\bar{x}) = \frac{N-n}{N-1} \cdot \frac{\sigma_X^2}{n} \tag{3.9}$$

The factor $\frac{N-n}{N-1}$ is called the correction for finite population because it
does not appear when infinite populations are involved or when sampling
with replacement which is equivalent to sampling from an infinite population.
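
Both variance formulas for $\bar{x}$ can be checked by enumerating all possible samples from a small population. The sketch below uses the four values 2, 5, 4, 6 with equal selection probabilities and n = 2; the helper names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations, product

def mean(values):
    return Fraction(sum(values), len(values))

def var_of_sample_mean(samples, center):
    """Average squared deviation of the sample mean over equally likely samples."""
    return sum((mean(s) - center) ** 2 for s in samples) / len(samples)

population = [2, 5, 4, 6]
N, n = len(population), 2
xbar = mean(population)
sigma2 = sum((x - xbar) ** 2 for x in population) / N   # divisor N, as in the text

with_repl = list(product(population, repeat=n))       # N**n ordered samples
without_repl = list(combinations(population, n))      # C(N, n) unordered samples

v_with = var_of_sample_mean(with_repl, xbar)
v_without = var_of_sample_mean(without_repl, xbar)
print(v_with, sigma2 / n)
print(v_without, Fraction(N - n, N - 1) * sigma2 / n)
```

On each line the enumerated variance agrees with the formula: 35/32 with replacement and 35/48 without replacement, the second being the first multiplied by the correction for finite population, (4-2)/(4-1).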

For two characteristics, X and Y, of elements in the same simple random
sample, the covariance of $\bar{x}$ and $\bar{y}$ is given by a formula analogous to
Equation (3.9); namely,

$$Cov(\bar{x},\bar{y}) = \frac{N-n}{N-1} \cdot \frac{\sigma_{XY}}{n} \tag{3.10}$$

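3.5.2 UNEQUAL PROBABILITY OF SELECTION

In Section 3.3 we proved that $x' = \frac{\sum_{i=1}^{n} \frac{x_i}{p_i}}{n}$ is an unbiased estimate
of the population total. This was for sampling with replacement and
unequal probability of selection. We will now proceed to find the
variance of x'.

By definition $Var(x') = E[x' - E(x')]^2$. Let $X = \sum_j^N X_j$. Then since
$E(x') = X$, it follows that

$$Var(x') = E\left[\frac{\frac{x_1}{p_1} + \cdots + \frac{x_n}{p_n}}{n} - X\right]^2 = \frac{1}{n^2} E\left[\left(\frac{x_1}{p_1} - X\right) + \cdots + \left(\frac{x_n}{p_n} - X\right)\right]^2$$

$$= \frac{1}{n^2} E\left[\sum_i^n \left(\frac{x_i}{p_i} - X\right)^2 + \sum_{i \neq k} \sum \left(\frac{x_i}{p_i} - X\right)\left(\frac{x_k}{p_k} - X\right)\right]$$

Applying Theorem 3.3, Var(x') becomes

$$Var(x') = \frac{1}{n^2}\left[\sum_i^n E\left(\frac{x_i}{p_i} - X\right)^2 + \sum_{i \neq k} \sum E\left(\frac{x_i}{p_i} - X\right)\left(\frac{x_k}{p_k} - X\right)\right] \tag{3.11}$$

Notice the similarity of Equations (3.8) and (3.11) and that the steps
leading to these two equations were the same. Again, since the sampling
is with replacement, the expected value of all product terms in Equation
(3.11) is zero. Therefore Equation (3.11) becomes

$$Var(x') = \frac{1}{n^2}\left[\sum_i^n E\left(\frac{x_i}{p_i} - X\right)^2\right]$$

By definition $E\left(\frac{x_i}{p_i} - X\right)^2 = \sum_j^N P_j\left(\frac{X_j}{P_j} - X\right)^2$

Therefore

$$Var(x') = \frac{\sum_j^N P_j\left(\frac{X_j}{P_j} - X\right)^2}{n} \tag{3.12}$$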

Exercise 3.11. (a) Refer to Exercise 3.1 and compute the variance
of x' for samples of two (that is, n = 2) using Equation (3.12). (b) Then
turn to Illustration 3.7 and compute the variance of x' from the actual
values of x'. Don't overlook the fact that the values of x' have unequal
probabilities. According to Definition 3.7, the variance of x' is
$\sum_{j=1}^{10} P_j(x'_j - X)^2$ where $X = E(x')$, $x'_j$ is one of the 10 possible values of x',
and $P_j$ is the probability of $x'_j$.
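
The two routes to Var(x') described in this exercise can be compared on a separate example so that the exercise itself is left to the student. The sketch below is only an illustration: the three-element population, its selection probabilities, and the function names are hypothetical. It evaluates Equation (3.12) and also computes the variance directly from the full sampling distribution of x'.

```python
from fractions import Fraction as F
from itertools import product

def var_formula(X, P, n):
    """Equation (3.12): sum_j P_j (X_j/P_j - X)^2 / n, with X the population total."""
    total = sum(X)
    return sum(p * (x / p - total) ** 2 for x, p in zip(X, P)) / n

def var_enumerated(X, P, n):
    """Variance of x' from its full sampling distribution (ordered samples)."""
    total = sum(X)   # equals E(x') because x' is unbiased
    var = F(0)
    for draws in product(range(len(X)), repeat=n):
        prob = F(1)
        for j in draws:
            prob *= P[j]
        estimate = sum(X[j] / P[j] for j in draws) / n
        var += prob * (estimate - total) ** 2
    return var

# Hypothetical population and selection probabilities (not from the text).
X = [10, 20, 30]
P = [F(1, 2), F(1, 4), F(1, 4)]
print(var_formula(X, P, 2), var_enumerated(X, P, 2))
```

The two printed values agree, as Equation (3.12) requires.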

3.6 VARIANCE OF A LINEAR COMBINATION

Before presenting a general theorem on the variance of a linear
combination of random variables, a few key variance and covariance
relationships will be given. In the following equations X and Y are random
variables and a, b, c, and d are constants:

    Var(X+a) = Var(X)

    Var(aX) = a^2 Var(X)

    Var(aX+b) = a^2 Var(X)

    Cov(X+a,Y+b) = Cov(X,Y)

    Cov(aX,bY) = ab Cov(X,Y)

    Cov(aX+b,cY+d) = ac Cov(X,Y)

    Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)

    Var(X+Y+a) = Var(X+Y)

    Var(aX+bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y)
Illustration 3.9. The above relationships are easily verified by
using the theory of expected values. For example,

$$Var(aX+b) = E[aX+b-E(aX+b)]^2$$
$$= E[aX+b-E(aX)-E(b)]^2$$
$$= E[aX-aE(X)]^2$$
$$= E[a(X-\bar{X})]^2$$
$$= a^2E(X-\bar{X})^2 = a^2Var(X)$$

Exercise 3.12. As in Illustration 3.9 use the theory of expected
values to prove that

$$Cov(aX+b,cY+d) = ac\,Cov(X,Y)$$

As in Theorem 3.3, let $u = a_1u_1 + \cdots + a_ku_k$ where $a_1,\ldots,a_k$ are constants
and $u_1,\ldots,u_k$ are random variables. By definition the variance of u is

$$Var(u) = E[u-E(u)]^2$$

By substitution

$$Var(u) = E[a_1u_1 + \cdots + a_ku_k - E(a_1u_1 + \cdots + a_ku_k)]^2$$
$$= E[a_1(u_1-\bar{u}_1) + \cdots + a_k(u_k-\bar{u}_k)]^2 \quad \text{where } E(u_i) = \bar{u}_i$$

By squaring the quantity in [ ] and considering the expected values of
the terms in the series, the following result is obtained.

Theorem 3.5. The variance of u, a linear combination of random
variables, is given by the following equation

$$Var(u) = \sum_i^k a_i^2\sigma_i^2 + \sum_{i \neq j} \sum a_ia_j\sigma_{ij}$$

where $\sigma_i^2$ is the variance of $u_i$ and $\sigma_{ij}$ is the covariance of $u_i$ and $u_j$.

Theorems 3.3 and 3.5 are very useful because many estimates from
probability samples are linear combinations of random variables.
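
Theorem 3.5 translates directly into a short function. The sketch below is an illustration under stated assumptions: the function name `var_linear_combination` and the variances and covariance supplied to it are hypothetical. The matrix `cov` holds $\sigma_i^2$ on its diagonal and $\sigma_{ij}$ elsewhere.

```python
def var_linear_combination(a, cov):
    """Var(sum a_i u_i) = sum_i a_i^2 s_ii + sum_{i != j} a_i a_j s_ij,
    where cov[i][j] is the covariance of u_i and u_j (cov[i][i] the variance)."""
    k = len(a)
    squares = sum(a[i] ** 2 * cov[i][i] for i in range(k))
    cross = sum(a[i] * a[j] * cov[i][j]
                for i in range(k) for j in range(k) if i != j)
    return squares + cross

# Hypothetical variances and covariance for two random variables X and Y.
var_x, var_y, cov_xy = 4.0, 9.0, 1.5
cov = [[var_x, cov_xy],
       [cov_xy, var_y]]

print(var_linear_combination([1, 1], cov))    # Var(X+Y)
print(var_linear_combination([1, -1], cov))   # Var(X-Y)
print(var_linear_combination([2, 3], cov))    # Var(2X+3Y)
```

Because the double sum over $i \neq j$ counts each pair twice, the cross term for two variables is $2a_1a_2\sigma_{12}$, which reproduces the relationship $Var(aX+bY) = a^2Var(X) + b^2Var(Y) + 2ab\,Cov(X,Y)$ listed at the start of this section.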

Illustration 3.10. Suppose for a srs (simple random sample) that
data have been obtained for two characteristics X and Y, the sample
values being $x_1,\ldots,x_n$ and $y_1,\ldots,y_n$. What is the variance of $\bar{x}-\bar{y}$?
From the theory and results that have been presented one can proceed
immediately to write the answer. From Theorem 3.5 we know that $Var(\bar{x}-\bar{y}) =
Var(\bar{x}) + Var(\bar{y}) - 2Cov(\bar{x},\bar{y})$. From the sampling specifications we know the
variances of $\bar{x}$ and $\bar{y}$ and the covariance. See Equations (3.9) and (3.10).
Thus, the following result is easily obtained:

$$Var(\bar{x}-\bar{y}) = \left(\frac{N-n}{N-1}\right)\left(\frac{1}{n}\right)(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}) \tag{3.13}$$

Some readers might be curious about the relationship between
covariance and correlation. By definition the correlation between X and Y is

$$r_{XY} = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}} = \frac{\sigma_{XY}}{\sigma_X\sigma_Y}$$

Therefore, one could substitute $r_{XY}\sigma_X\sigma_Y$ for $\sigma_{XY}$ in Equation (3.13).

Exercise 3.13. In a statistical publication suppose you find ...
bushels per acre as the yield of corn in State A and 83 is the estimated
yield for State B. The estimated standard errors are given as 1.5 and
2.0 bushels. You become interested in the standard error of the
difference in yield between the two States and want to know how large the
estimated difference is in relation to its standard error. Find the
standard error of the difference. You may assume that the two yield
estimates are independent because the sample selection in one State was
completely independent of the other. Answer: 2.5.

Illustration 3.11. No doubt students who are familiar with sampling
have already recognized the application of Theorems 3.3 and 3.5 to several
sampling plans and methods of estimation. For example, for stratified
random sampling, an estimator of the population total is

$$x' = N_1\bar{x}_1 + \cdots + N_k\bar{x}_k = \sum N_i\bar{x}_i$$

where $N_i$ is the population number of sampling units in the i-th stratum
and $\bar{x}_i$ is the average per sampling unit of characteristic X from a sample
of $n_i$ sampling units from the i-th stratum. According to Theorem 3.3

$$E(x') = E\sum N_i\bar{x}_i = \sum N_iE(\bar{x}_i)$$

If the sampling is such that $E(\bar{x}_i) = \bar{X}_i$ for all strata, x' is an unbiased
estimate of the population total. According to Theorem 3.5

$$Var(x') = N_1^2 Var(\bar{x}_1) + \cdots + N_k^2 Var(\bar{x}_k) \tag{3.14}$$

There are no covariance terms in Equation (3.14) because the sample selection
in one stratum is independent of another stratum. Assuming a srs from each
stratum, Equation (3.14) becomes

$$Var(x') = N_1^2 \frac{N_1-n_1}{N_1-1} \cdot \frac{\sigma_1^2}{n_1} + \cdots + N_k^2 \frac{N_k-n_k}{N_k-1} \cdot \frac{\sigma_k^2}{n_k} \tag{3.15}$$

where $\sigma_i^2$ is the variance of X among sampling units within the i-th stratum.
i 



ERIC 



88 

Illustration 3.12. Suppose $x'_1,\ldots,x'_k$ are independent estimates of
the same quantity, T. That is, $E(x'_i) = T$. Let $\sigma_i^2$ be the variance of $x'_i$.
Consider a weighted average of the estimates, namely

$$x' = w_1x'_1 + \cdots + w_kx'_k \qquad (3.15)$$

where $\sum w_i = 1$. Then

$$E(x') = w_1E(x'_1) + \cdots + w_kE(x'_k) = T \qquad (3.16)$$

That is, for any set of weights where $\sum w_i = 1$ the expected value of x' is
T. How should the weights be chosen?

The variance of x' is

$$\mathrm{Var}(x') = w_1^2\sigma_1^2 + \cdots + w_k^2\sigma_k^2$$

If we weight the estimates equally, $w_i = \frac{1}{k}$ and the variance of x' is

$$\mathrm{Var}(x') = \frac{1}{k}\left[\frac{\sum\sigma_i^2}{k}\right] \qquad (3.17)$$

which is the average variance divided by k. However, it is reasonable to
give more weight to estimates having low variance. Using differential
calculus we can find the weights which will minimize the variance of x'.
The optimum weights are inversely proportional to the variances of the
estimates. That is, $w_i \propto \frac{1}{\sigma_i^2}$.

As an example, suppose one has two independent unbiased estimates of
the same quantity which originate from two different samples. The optimum
weighting of the two estimates would be

$$\frac{\dfrac{1}{\sigma_1^2}\,x'_1 + \dfrac{1}{\sigma_2^2}\,x'_2}{\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2}}$$
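The effect of inverse-variance weighting in Illustration 3.12 can be seen numerically. The sketch below is only an illustration: the two variances (1 and 4) are hypothetical values, and the function names are assumptions introduced here.

```python
from fractions import Fraction

def optimum_weights(variances):
    """Weights proportional to 1/variance, scaled so they sum to 1."""
    inverse = [1 / Fraction(v) for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]

def variance_of_weighted_average(weights, variances):
    """Var(x') = sum of w_i^2 * sigma_i^2 for independent estimates."""
    return sum(w * w * Fraction(v) for w, v in zip(weights, variances))

variances = [1, 4]                        # hypothetical sigma_1^2 and sigma_2^2
equal = [Fraction(1, 2), Fraction(1, 2)]  # equal weighting, w_i = 1/k
best = optimum_weights(variances)

var_equal = variance_of_weighted_average(equal, variances)
var_best = variance_of_weighted_average(best, variances)
print(best, var_equal, var_best)
```

For these hypothetical variances the optimum weights are 4/5 and 1/5, and the variance of the weighted average drops from 5/4 (equal weights) to 4/5.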



As another example, suppose $x_1,\ldots,x_k$ are the values of X in a sample
of k sampling units selected with equal probability and with replacement.
In this case each $x_i$ is an unbiased estimate of $\bar{X}$. If we let $w_i = \frac{1}{k}$, x'
is $\bar{x}$, the simple average of the sample values. Notice, as one would expect,
Equation (3.16) reduces to $E(\bar{x}) = \bar{X}$. Also, since each estimate, $x_i$, is the
same random variable that could be equal to any value in the set $X_1,\ldots,X_N$,
it is clear that all of the $\sigma_i^2$'s must be equal to $\sigma^2 = \frac{\sum(X_i-\bar{X})^2}{N}$. Hence,
Equation (3.17) reduces to $\frac{\sigma^2}{k}$ which agrees with the first part of Section
3.5.1.

Exercise 3.14. If you equate $x'_i$ in Equation (3.15) with $\frac{x_i}{p_i}$ in
Section 3.5.2 and let $w_i = \frac{1}{n}$ and k = n, then x' in Equation (3.15) is the
same as $x' = \frac{\sum \frac{x_i}{p_i}}{n}$ in Section 3.5.2. Show that in this case Equation (3.17)
becomes the same as Equation (3.12).
3.7 ESTIMATION OF VARIANCE 

All of the variance formulas presented in previous sections have
involved calculations from a population set of values. In practice, we
have data for only a sample. Hence, we must consider means of estimating
variances from sample data.

3.7.1 SIMPLE RANDOM SAMPLING

In Section 3.5.1, we found that the variance of the mean of a srs is

$$\sigma_{\bar{x}}^2 = \frac{N-n}{N-1}\,\frac{\sigma_X^2}{n} \qquad (3.18)$$

where $\sigma_X^2 = \frac{\sum_i^N (X_i-\bar{X})^2}{N}$.

As an estimator of $\sigma_X^2$, $\frac{\sum_i^n (x_i-\bar{x})^2}{n}$ seems like a natural first choice for
consideration. However, when sampling finite populations, it is customary
to define variance among units of the population as follows:

$$S^2 = \frac{\sum_i^N (X_i-\bar{X})^2}{N-1}$$

and to use $s^2 = \frac{\sum_i^n (x_i-\bar{x})^2}{n-1}$ as an estimator of $S^2$. A reason for this
will become apparent when we find the expected value of $s^2$ as follows:

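The formula for $s^2$ can be written in a form that is more convenient
for finding $E(s^2)$. Thus,

$$s^2 = \frac{\sum(x_i-\bar{x})^2}{n-1} = \frac{\sum x_i^2 - n\bar{x}^2}{n-1}$$

and

$$E(s^2) = \frac{1}{n-1}\left[\sum E(x_i^2) - nE(\bar{x}^2)\right]$$

We have shown previously that $x_i$ is a random variable that has an equal
probability of being any value in the set $X_1,\ldots,X_N$. Therefore

$$E(x_i^2) = \frac{\sum_i^N X_i^2}{N} \quad\text{and}\quad \sum_i^n E(x_i^2) = \frac{n\sum_i^N X_i^2}{N}$$

Hence,

$$E(s^2) = \frac{n}{n-1}\left[\frac{\sum X_i^2}{N} - E(\bar{x}^2)\right] \qquad (3.19)$$

We know, by definition, that $\sigma_{\bar{x}}^2 = E(\bar{x}-\bar{X})^2$ and it is easy to show that

$$E(\bar{x}-\bar{X})^2 = E(\bar{x}^2) - \bar{X}^2$$

Therefore, $E(\bar{x}^2) = \sigma_{\bar{x}}^2 + \bar{X}^2$.

By substitution in Equation (3.19) we obtain

$$E(s^2) = \frac{n}{n-1}\left[\frac{\sum X_i^2}{N} - \bar{X}^2 - \sigma_{\bar{x}}^2\right]$$

By definition $\sigma_X^2 = \frac{\sum(X_i-\bar{X})^2}{N} = \frac{\sum X_i^2}{N} - \bar{X}^2$ and since the specified method of
sampling was srs, $\sigma_{\bar{x}}^2 = \frac{N-n}{N-1}\,\frac{\sigma_X^2}{n}$, we have

$$E(s^2) = \frac{n}{n-1}\left[\sigma_X^2 - \frac{N-n}{N-1}\,\frac{\sigma_X^2}{n}\right]$$

which after simplification is

$$E(s^2) = \frac{N}{N-1}\,\sigma_X^2$$

Note from the above definitions of $\sigma_X^2$ and $S^2$ that

$$S^2 = \frac{N}{N-1}\,\sigma_X^2$$

Therefore $E(s^2) = S^2$.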



Since $s^2$ is an unbiased estimate of $S^2$, we will now substitute $\frac{N-1}{N}S^2$ for
$\sigma_X^2$ in Equation (3.18) which gives

$$\mathrm{Var}(\bar{x}) = \frac{N-n}{N}\,\frac{S^2}{n} \qquad (3.20)$$

Both Equations, (3.18) and (3.20), for the $\mathrm{Var}(\bar{x})$ give identical results
and both agree with $E(\bar{x}-\bar{X})^2$ as a definition of variance. We have shown
that $s^2$ is an unbiased estimate of $S^2$. Substituting $s^2$ for $S^2$ in Equation
(3.20) we have

$$\mathrm{var}(\bar{x}) = \frac{N-n}{N}\,\frac{s^2}{n} \qquad (3.21)$$

as an estimate of the variance of $\bar{x}$. With regard to Equation (3.18),
$\frac{N-1}{N}s^2$ is an unbiased estimate of $\sigma_X^2$. When $\frac{N-1}{N}s^2$ is substituted for
$\sigma_X^2$, Equation (3.21) is obtained.

Since in Equation (3.20), $\frac{N-n}{N}$ is exactly 1 minus the sampling fraction
and $s^2$ is an unbiased estimate of $S^2$, there is some advantage to using
Equation (3.20) and $S^2 = \frac{\sum(X_i-\bar{X})^2}{N-1}$ as a definition of variance among
sampling units in the population.

Exercise 3.15. For a small population of 4 elements suppose the
values of X are $X_1 = 2$, $X_2 = 5$, $X_3 = 3$, and $X_4 = 6$. Consider simple
random samples of size 2. There are six possible samples.

(a) For each of the six samples calculate $\bar{x}$ and $s^2$. That is,
find the sampling distribution of $\bar{x}$ and the sampling
distribution of $s^2$.

(b) Calculate $S^2$, then find $\mathrm{Var}(\bar{x})$ using Equation (3.20).

(c) Calculate the variance among the six values of $\bar{x}$ and compare
the result with $\mathrm{Var}(\bar{x})$ obtained in (b). The results should
be the same.

(d) From the sampling distribution of $s^2$ calculate $E(s^2)$ and
verify that $E(s^2) = S^2$.
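One way to check the arithmetic in Exercise 3.15 is to enumerate the six samples directly. The sketch below uses the population values given in the exercise; the helper names `mean` and `s_squared` are illustrative assumptions, and exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 5, 3, 6]      # values of X from Exercise 3.15
N, n = len(population), 2

def mean(values):
    return Fraction(sum(values), len(values))

def s_squared(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

samples = list(combinations(population, n))   # the six possible samples
means = [mean(s) for s in samples]
s2_values = [s_squared(s) for s in samples]

S2 = s_squared(population)                    # S^2 with divisor N - 1
var_formula = Fraction(N - n, N) * S2 / n     # Equation (3.20)

expected_mean = mean(means)
var_enumerated = sum((x - expected_mean) ** 2 for x in means) / len(means)
expected_s2 = mean(s2_values)

print(var_formula, var_enumerated, expected_s2, S2)
```

Each of the six samples is equally likely, so the plain average over `samples` is the expected value.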
3.7.2 UNEQUAL PROBABILITY OF SELECTION 

In Section 3.5.2, we derived a formula for the variance of the
estimator x' where

$$x' = \frac{\sum_i^n \frac{x_i}{p_i}}{n} \qquad (3.22)$$

The sampling was with unequal selection probabilities and with replacement.
We found that the variance of x' was given by

$$\mathrm{Var}(x') = \frac{\sum_i^N P_i\left(\frac{X_i}{P_i} - X\right)^2}{n} \qquad (3.23)$$

As a formula for estimating Var(x') from a sample one might be inclined,
as a first guess, to try a formula of the same form as Equation (3.23) but
that does not work. Equation (3.23) is a weighted average of the squares
of deviations $\left(\frac{X_i}{P_i} - X\right)^2$ which reflects the unequal selection probabilities.
If one applied the same weighting system in a formula for estimating
variance from a sample he would in effect be applying the weights twice;
first, in the selection process itself and second, to the sample data.
The unequal probability of selection is already incorporated into the
sample itself.

As in some of the previous discussion, look at the estimator as follows:

$$x' = \frac{\frac{x_1}{p_1} + \cdots + \frac{x_n}{p_n}}{n} = \frac{x'_1 + \cdots + x'_n}{n} \quad\text{where } x'_i = \frac{x_i}{p_i}$$

Each $x'_i$ is an independent unbiased estimate of the population total. Since
each value of $x'_i$ receives an equal weight in determining x' it appears that
the following formula for estimating Var(x') might work:

$$\mathrm{var}(x') = \frac{s^2}{n} \qquad (3.24)$$

where $s^2 = \frac{\sum_i^n (x'_i - x')^2}{n-1}$.

By following an approach similar to that used in Section 3.7.1, one can
prove that

$$E(s^2) = \sum_i^N P_i\left(\frac{X_i}{P_i} - X\right)^2$$

That is, Equation (3.24) does provide an unbiased estimate of Var(x') in
Equation (3.23). The proof is left as an exercise.
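The claim that Equation (3.24) is unbiased for Equation (3.23) can be checked by enumeration on a small example. The population values and selection probabilities below are hypothetical (they are not the population of Exercise 3.1), and the function name `var_estimate` is an assumption made for this sketch.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population values and selection probabilities (sum to 1).
X = [2, 5, 3, 6]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
n = 2
total = sum(X)

# Var(x') from Equation (3.23): sampling with replacement.
var_true = sum(p * (Fraction(x) / p - total) ** 2 for x, p in zip(X, P)) / n

def var_estimate(sample):
    """var(x') = s^2 / n, Equation (3.24), for one ordered sample of indexes."""
    ratios = [Fraction(X[i]) / P[i] for i in sample]      # the x'_i values
    x_prime = sum(ratios) / len(ratios)
    s2 = sum((r - x_prime) ** 2 for r in ratios) / (len(ratios) - 1)
    return s2 / len(ratios)

# Expected value over every ordered sample, weighted by its probability.
expected_var_estimate = Fraction(0)
for sample in product(range(len(X)), repeat=n):
    prob = Fraction(1)
    for i in sample:
        prob *= P[i]
    expected_var_estimate += prob * var_estimate(sample)

print(var_true, expected_var_estimate)
```

The samples are weighted by their selection probabilities rather than counted equally, which is the point made in part (b) of Exercise 3.16.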

Exercise 3.16. Reference is made to Exercise 3.1, Illustration 3.7,
and Exercise 3.11. In Illustration 3.7 the sampling distribution of x'
(See Equation (3.22)) is given for samples of 2 from the population of
4 elements that was given in Exercise 3.1.

(a) Compute $\mathrm{var}(x') = \frac{s^2}{n}$ (Equation (3.24)) for each of the 10
possible samples.

(b) Compute the expected value of var(x') and compare it with the
result obtained in Exercise 3.11. The results should be the
same. Remember, when finding the expected value of var(x'),
that the x''s do not occur with equal frequency.
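3.8 RATIO OF TWO RANDOM VARIABLES

In sampling theory and practice one frequently encounters estimates
that are ratios of random variables. It was pointed out earlier that
$E\left(\frac{u}{w}\right) \ne \frac{E(u)}{E(w)}$ where u and w are random variables. Formulas for the expected
value of a ratio and for the variance of a ratio will now be presented
without derivation. The formulas are approximations:

$$E\left(\frac{u}{w}\right) \approx \frac{\bar{u}}{\bar{w}} + \frac{\bar{u}}{\bar{w}}\left[\frac{\sigma_w^2}{\bar{w}^2} - \frac{\rho_{uw}\sigma_u\sigma_w}{\bar{u}\bar{w}}\right] \qquad (3.25)$$

$$\mathrm{Var}\left(\frac{u}{w}\right) \approx \frac{\bar{u}^2}{\bar{w}^2}\left[\frac{\sigma_u^2}{\bar{u}^2} + \frac{\sigma_w^2}{\bar{w}^2} - \frac{2\rho_{uw}\sigma_u\sigma_w}{\bar{u}\bar{w}}\right] \qquad (3.26)$$

where $\bar{u} = E(u)$

$\bar{w} = E(w)$

$\sigma_u^2 = E(u-\bar{u})^2$

$\sigma_w^2 = E(w-\bar{w})^2$

$\rho_{uw} = \frac{\sigma_{uw}}{\sigma_u\sigma_w}$ where $\sigma_{uw} = E(u-\bar{u})(w-\bar{w})$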

For a discussion of the conditions under which Equations (3.25) and
(3.26) are good approximations, reference is made to Hansen, Hurwitz, and
Madow. 2/ The conditions are usually satisfied with regard to estimates
from sample surveys. As a rule of thumb the variance formula is usually
accepted as satisfactory if the coefficient of variation of the variable
in the denominator is less than 0.1; that is, if $\frac{\sigma_w}{\bar{w}} < 0.1$. In other words,
this condition states that the coefficient of variation of the estimate in
the denominator should be less than 10 percent. A larger coefficient of
variation might be tolerable before becoming concerned about Equation (3.26)
as an approximation.

The condition $\frac{\sigma_w}{\bar{w}} < 0.1$ is more stringent than necessary for regarding
the bias of a ratio as negligible. With few exceptions in practice the
bias of a ratio is ignored. Some of the logic for this will appear in
the illustration below. To summarize, the conditions when Equations (3.25)
and (3.26) are not good approximations are such that the ratio is likely to
be of questionable value owing to large variance.

If u and w are linear combinations of random variables, the theory
presented in previous sections applies to u and to w. Assuming u and w
are estimates from a sample, to estimate $\mathrm{Var}\left(\frac{u}{w}\right)$ take into account the
sample design and substitute in Equation (3.26) estimates of $\bar{u}$, $\bar{w}$, $\sigma_u^2$, $\sigma_w^2$,
and $\rho_{uw}$. Ignore Equation (3.25) unless there is reason to believe the bias
of the ratio might be important relative to its standard error.

It is of interest to note the similarity between Var(u-w) and $\mathrm{Var}\left(\frac{u}{w}\right)$.
According to Theorem 3.5,

$$\mathrm{Var}(u-w) = \sigma_u^2 + \sigma_w^2 - 2\rho_{uw}\sigma_u\sigma_w$$

By definition the relative variance of an estimate is the variance of the
estimate divided by the square of its expected value. Thus, in terms of
the relative variance of a ratio, Equation (3.26) can be written

$$\mathrm{Rel\ Var}\left(\frac{u}{w}\right) = \frac{\sigma_u^2}{\bar{u}^2} + \frac{\sigma_w^2}{\bar{w}^2} - \frac{2\rho_{uw}\sigma_u\sigma_w}{\bar{u}\bar{w}}$$

The similarity is an aid to remembering the formula for $\mathrm{Var}\left(\frac{u}{w}\right)$.

2/ Hansen, Hurwitz, and Madow, Sample Survey Methods and Theory,
Volume I, Chapter 4, John Wiley and Sons, 1953.
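The approximation in Equation (3.26) and the rule of thumb on the coefficient of variation of the denominator can be put into a short sketch. The function names and the numerical inputs are hypothetical, chosen only to show how the pieces fit together.

```python
import math

def relative_variance_of_ratio(u_bar, w_bar, var_u, var_w, rho):
    """Approximate Rel Var(u/w): rel var of u + rel var of w - 2 rho cv_u cv_w."""
    sd_u, sd_w = math.sqrt(var_u), math.sqrt(var_w)
    return (var_u / u_bar ** 2 + var_w / w_bar ** 2
            - 2 * rho * sd_u * sd_w / (u_bar * w_bar))

def variance_of_ratio(u_bar, w_bar, var_u, var_w, rho):
    """Approximate Var(u/w), Equation (3.26)."""
    return (u_bar / w_bar) ** 2 * relative_variance_of_ratio(
        u_bar, w_bar, var_u, var_w, rho)

def denominator_cv_ok(w_bar, var_w, limit=0.1):
    """Rule of thumb: coefficient of variation of w below the limit."""
    return math.sqrt(var_w) / abs(w_bar) < limit

# Hypothetical expected values, variances, and correlation.
u_bar, w_bar = 50.0, 200.0
var_u, var_w, rho = 4.0, 100.0, 0.5

rel_var = relative_variance_of_ratio(u_bar, w_bar, var_u, var_w, rho)
var_ratio = variance_of_ratio(u_bar, w_bar, var_u, var_w, rho)
print(denominator_cv_ok(w_bar, var_w), rel_var, var_ratio)
```

In practice the inputs would be replaced by sample estimates that take the sample design into account, as the text describes.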

Illustration 3.13. Suppose one has a simple random sample of n
elements from a population of N. Let $\bar{x}$ and $\bar{y}$ be the sample means for
characteristics X and Y. Then, $u = \bar{x}$, $w = \bar{y}$,

$$\sigma_u^2 = \frac{N-n}{N}\,\frac{S_X^2}{n} \quad\text{and}\quad \sigma_w^2 = \frac{N-n}{N}\,\frac{S_Y^2}{n}$$

Notice that the condition discussed above, $\frac{\sigma_w}{\bar{w}} < 0.1$, is satisfied if the
sample is large enough so

$$\frac{N-n}{N}\,\frac{S_Y^2}{n\bar{Y}^2} < (0.1)^2$$

Substituting in Equation (3.26) we obtain the following as the variance of
the ratio:

$$\mathrm{Var}\left(\frac{\bar{x}}{\bar{y}}\right) = \frac{N-n}{N}\,\frac{1}{n}\,\frac{\bar{X}^2}{\bar{Y}^2}\left[\frac{S_X^2}{\bar{X}^2} + \frac{S_Y^2}{\bar{Y}^2} - \frac{2\rho_{XY}S_XS_Y}{\bar{X}\bar{Y}}\right]$$

The bias of $\frac{\bar{x}}{\bar{y}}$ as an estimate of $\frac{\bar{X}}{\bar{Y}}$ is given by the second term of
Equation (3.25). For this illustration it becomes

$$\frac{N-n}{N}\,\frac{1}{n}\,\frac{\bar{X}}{\bar{Y}}\left[\frac{S_Y^2}{\bar{Y}^2} - \frac{\rho_{XY}S_XS_Y}{\bar{X}\bar{Y}}\right]$$

As the size of the sample increases, the bias decreases as $\frac{1}{n}$ whereas the
standard error of the ratio decreases at a slower rate, namely $\frac{1}{\sqrt{n}}$.
Thus, we need not be concerned about a possibility of the bias becoming
important relative to sampling error as the size of the sample increases.
A possible exception occurs when several ratios are combined. An example
is stratified random sampling when many strata are involved and separate
ratio estimates are made for the strata. This is discussed in the books
on sampling.

3.9 CONDITIONAL EXPECTATION

The theory for conditional expectation and conditional variance of a
random variable is a very important part of sampling theory, especially
in the theory for multistage sampling. The theory will be discussed with
reference to two-stage sampling.

The notation that will be used in this and the next section is as
follows:

M is the number of psu's (primary sampling units) in the population.

m is the number of psu's in the sample.

$N_i$ is the total number of elements in the $i$th psu.

$N = \sum_i^M N_i$ is the total number of elements in the population.

$n_i$ is the sample number of elements from the $i$th psu.

$n = \sum_i^m n_i$ is the total number of elements in the sample.

$\bar{n} = \frac{n}{m}$ is the average number of sample elements per psu in the sample.

$X_{ij}$ is the value of X for the $j$th element in the $i$th psu. It
refers to an element in the population, that is, $j = 1,\ldots,N_i$
and $i = 1,\ldots,M$.

$x_{ij}$ is the value of X for the $j$th element in the sample from the
$i$th psu in the sample, that is, the indexes i and j refer to
the set of psu's and elements in the sample.

$X_{i.} = \sum_j^{N_i} X_{ij}$ is the population total for the $i$th psu.

$\bar{X}_{i.} = \frac{X_{i.}}{N_i}$ is the average of X for all elements in the $i$th psu.

$\bar{X}_{..} = \frac{\sum_i^M\sum_j^{N_i} X_{ij}}{N} = \frac{\sum_i^M X_{i.}}{N}$ is the average of all N elements.

$\bar{X}_{.} = \frac{\sum_i^M X_{i.}}{M}$ is the average of the psu totals. Be sure to note the
difference between $\bar{X}_{..}$ and $\bar{X}_{.}$.

$x_{i.} = \sum_j^{n_i} x_{ij}$ is the sample total for the $i$th psu in the sample.

$\bar{x}_{i.} = \frac{x_{i.}}{n_i}$ is the average for the $n_i$ elements in the sample from
the $i$th psu.

$\bar{x}_{..} = \frac{\sum_i^m\sum_j^{n_i} x_{ij}}{n}$ is the average for all elements in the sample.

Assume simple random sampling, equal probability of selection without
replacement, at both stages. Consider the sample of $n_i$ elements from the
$i$th psu. We know from Section 3.3 that $\bar{x}_{i.}$ is an unbiased estimate of the
psu mean $\bar{X}_{i.}$; that is, $E(\bar{x}_{i.}) = \bar{X}_{i.}$ and for a fixed i (a specified psu)
$E(N_i\bar{x}_{i.}) = N_iE(\bar{x}_{i.}) = N_i\bar{X}_{i.} = X_{i.}$. But, owing to the first stage of sampling,
$E(N_i\bar{x}_{i.})$ must be treated as a random variable. Hence, it is necessary to
become involved with the expected value of an expected value.

First, consider X as a random variable, in the context of single-stage
sampling, which could equal any one of the values $X_{ij}$ in the
population set of $N = \sum_i^M N_i$. Let P(ij) be the probability of selecting
the $j$th element in the $i$th psu; that is, P(ij) is the probability of X
being equal to $X_{ij}$. By definition

$$E(X) = \sum_i^M\sum_j^{N_i} P(ij)X_{ij} \qquad (3.27)$$

Now consider the selection of an element as a two-step procedure:
(1) select a psu with probability P(i), and (2) select an element
within the selected psu with probability P(j|i). In words, P(j|i) is the
probability of selecting the $j$th element in the $i$th psu given that the
$i$th psu has already been selected. Thus, P(ij) = P(i)P(j|i). By
substitution, Equation (3.27) becomes

$$E(X) = \sum_i^M\sum_j^{N_i} P(i)P(j|i)X_{ij}$$

or

$$E(X) = \sum_i^M P(i)\sum_j^{N_i} P(j|i)X_{ij} \qquad (3.28)$$

By definition, $\sum_j^{N_i} P(j|i)X_{ij}$ is the expected value of X for a fixed value
of i. It is called "conditional expectation."

Let $E_2(X|i) = \sum_j^{N_i} P(j|i)X_{ij}$ where $E_2(X|i)$ is the form of notation we
will be using to designate conditional expectation. To repeat, $E_2(X|i)$
means the expected value of X for a fixed i. The subscript 2 indicates
that the conditional expectation applies to the second stage of sampling.
$E_1$ and $E_2$ will refer to expectation at the first and second stages,
respectively.

Substituting $E_2(X|i)$ in Equation (3.28) we obtain

$$E(X) = \sum_i^M P(i)E_2(X|i) \qquad (3.29)$$

There is one value of $E_2(X|i)$ for each of the M psu's. In fact $E_2(X|i)$
is a random variable where the probability of $E_2(X|i)$ is P(i). Thus the
right-hand side of Equation (3.29) is, by definition, the expected value
of $E_2(X|i)$. This leads to the following theorem:

Theorem 3.6. $E(X) = E_1E_2(X|i)$

Suppose $P(j|i) = \frac{1}{N_i}$ and $P(i) = \frac{1}{M}$. Then,

$$E_2(X|i) = \sum_j^{N_i}\frac{1}{N_i}X_{ij} = \bar{X}_{i.}$$

and

$$E(X) = E_1(\bar{X}_{i.}) = \sum_i^M\frac{1}{M}\bar{X}_{i.} = \frac{\sum_i^M \bar{X}_{i.}}{M}$$

In this case E(X) is an unweighted average of the psu averages. It is
important to note that, if P(i) and P(j|i) are chosen in such a way that
P(ij) is constant, every element has the same chance of selection. This
point will be discussed later.
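Theorem 3.6 and the remark about constant P(ij) can be illustrated with a small numerical sketch. The three psu's and their element values below are hypothetical, and the function name `two_stage_expectation` is an assumption introduced for illustration.

```python
from fractions import Fraction

# Hypothetical X_ij values, one inner list per psu.
psus = [[1, 3], [2, 4, 9], [5, 6, 8]]
M = len(psus)
N = sum(len(unit) for unit in psus)

def two_stage_expectation(p_first):
    """E_1 E_2(X|i) with P(j|i) = 1/N_i and first-stage probabilities p_first."""
    total = Fraction(0)
    for p_i, unit in zip(p_first, psus):
        e2 = Fraction(sum(unit), len(unit))   # E_2(X|i), the psu mean
        total += p_i * e2
    return total

# P(i) = 1/M: E(X) is the unweighted average of the psu means.
equal_psu = two_stage_expectation([Fraction(1, M)] * M)
mean_of_psu_means = sum(Fraction(sum(u), len(u)) for u in psus) / M

# P(i) = N_i/N: P(ij) = 1/N is constant and E(X) is the mean of all elements.
proportional = two_stage_expectation([Fraction(len(u), N) for u in psus])
overall_mean = Fraction(sum(sum(u) for u in psus), N)

print(equal_psu, mean_of_psu_means, proportional, overall_mean)
```

The two choices of P(i) give different expected values unless all psu's are the same size, which is why the distinction matters in later illustrations.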

Theorem 3.3 dealt with the expected value of a linear combination of
random variables. There is a corresponding theorem for conditional
expectation. Assume the linear combination is

$$U = a_1u_1 + \cdots + a_ku_k = \sum_t^k a_tu_t$$

where $a_1,\ldots,a_k$ are constants and $u_1,\ldots,u_k$ are random variables. Let
$E(U|c_i)$ be the expected value of U under a specified condition, $c_i$, where
$c_i$ is one of the conditions out of a set of M conditions that could occur.
The theorem on conditional expectation can then be stated symbolically as
follows:

Theorem 3.7. $E(U|c_i) = a_1E(u_1|c_i) + \cdots + a_kE(u_k|c_i)$

or $E(U|c_i) = \sum_t^k a_tE(u_t|c_i)$

Compare Theorems 3.7 and 3.3 and note that Theorem 3.7 is like
Theorem 3.3 except that conditional expectation is applied. Assume $c_i$ is
a random event and that the probability of the event $c_i$ occurring is P(i).
Then $E(U|c_i)$ is a random variable and by definition the expected value of
$E(U|c_i)$ is $\sum_i^M P(i)E(U|c_i)$ which is E(U). Thus, we have the following
theorem:

Theorem 3.8. The expected value of U is the expected value of the
conditional expected value of U, which in symbols is written as follows:

$$E(U) = EE(U|c_i) \qquad (3.30)$$

Substituting the value of $E(U|c_i)$ from Theorem 3.7 in Equation (3.30)
we have

$$E(U) = E[a_1E(u_1|c_i) + \cdots + a_kE(u_k|c_i)] = E\left[\sum_t^k a_tE(u_t|c_i)\right] \qquad (3.31)$$

Illustration 3.14. Assume two-stage sampling with simple random
sampling at both stages. Let x', defined as follows, be the estimator of
the population total:

$$x' = \frac{M}{m}\sum_i^m \frac{N_i}{n_i}\sum_j^{n_i} x_{ij} \qquad (3.32)$$

Exercise 3.17. Examine the estimator, x', Equation (3.32). Express
it in other forms that might help show its logical structure. For example,
for a fixed i what is $\frac{N_i}{n_i}\sum_j x_{ij}$? Does it seem like a reasonable way of
estimating the population total?

To display x' as a linear combination of random variables it is
convenient to express it in the following form:

$$x' = \left[\frac{M}{m}\frac{N_1}{n_1}x_{11} + \cdots + \frac{M}{m}\frac{N_1}{n_1}x_{1n_1}\right] + \cdots + \left[\frac{M}{m}\frac{N_m}{n_m}x_{m1} + \cdots + \frac{M}{m}\frac{N_m}{n_m}x_{mn_m}\right] \qquad (3.33)$$

Suppose we want to find the expected value of x' to determine whether it
is equal to the population total. According to Theorem 3.8,

$$E(x') = E_1E_2(x'|i) \qquad (3.34)$$

$$E(x') = E_1E_2\left[\frac{M}{m}\sum_i^m \frac{N_i}{n_i}\sum_j^{n_i} x_{ij}\,\Big|\,i\right] \qquad (3.35)$$

Equations (3.34) and (3.35) are obtained simply by substituting x' as
the random variable in (3.30). The condition, i, now refers to any one of the m
psu's in the sample. First we must solve the conditional expectation,
$E_2(x'|i)$. Since $\frac{M}{m}$ and $\frac{N_i}{n_i}$ are constant with respect to the conditional
expectation, and making use of Theorem 3.7, we can write

$$E_2(x'|i) = \frac{M}{m}\sum_i^m \frac{N_i}{n_i}\sum_j^{n_i} E_2(x_{ij}|i) \qquad (3.36)$$

We know for any given psu in the sample that $x_{ij}$ is an element in a
simple random sample from the psu and according to Section 3.3 its
expected value is the psu mean, $\bar{X}_{i.}$. That is,

$$E_2(x_{ij}|i) = \bar{X}_{i.}$$

and

$$\sum_j^{n_i} E_2(x_{ij}|i) = n_i\bar{X}_{i.} \qquad (3.37)$$

Substituting the result from Equation (3.37) in Equation (3.36) gives

$$E_2(x'|i) = \frac{M}{m}\sum_i^m N_i\bar{X}_{i.} \qquad (3.38)$$

Next we need to find the expected value of $E_2(x'|i)$. In Equation
(3.38), $N_i$ is a random variable, as well as $\bar{X}_{i.}$, associated with the first
stage of sampling. Accordingly, we will take $X_{i.} = N_i\bar{X}_{i.}$ as the random
variable which gives in lieu of Equation (3.38),

$$E_2(x'|i) = \frac{M}{m}\sum_i^m X_{i.}$$

Therefore,

$$E(x') = E_1\left[\frac{M}{m}\sum_i^m X_{i.}\right]$$

From Theorem 3.3

$$E_1\left[\frac{M}{m}\sum_i^m X_{i.}\right] = \frac{M}{m}\sum_i^m E_1(X_{i.})$$

Since

$$E_1(X_{i.}) = \frac{\sum_i^M X_{i.}}{M}, \qquad \sum_i^m E_1(X_{i.}) = m\left[\frac{\sum_i^M X_{i.}}{M}\right]$$

Therefore, $E(x') = \sum_i^M X_{i.}$. This shows that x' is an unbiased
estimator of the population total.
3.10 CONDITIONAL VARIANCE

Conditional variance refers to the variance of a variable under a
specified condition or limitation. It is related to conditional
probability and to conditional expectation.

To find the variance of x' (See Equation (3.32) or (3.33)) the following
important theorem will be used:

Theorem 3.9. The variance of x' is given by

$$V(x') = V_1E_2(x'|i) + E_1V_2(x'|i)$$

where $V_1$ is the variance for the first stage of sampling and $V_2$ is the
"conditional" variance for the second stage.

We have discussed $E_2(x'|i)$ and noted there is one value of $E_2(x'|i)$
for each psu in the population. Hence $V_1E_2(x'|i)$ is simply the variance
of the M values of $E_2(x'|i)$.

In Theorem 3.9 the conditional variance, $V_2(x'|i)$, by definition is

$$V_2(x'|i) = E_2\{[x' - E_2(x'|i)]^2\,|\,i\}$$

To understand $V_2(x'|i)$ think of x' as a linear combination of random
variables (see Equation (3.33)). Consider the variance of x' when i is
held constant. All terms (random variables) in the linear combination
are now constant except those originating from sampling within the $i$th
psu. Therefore, $V_2(x'|i)$ is associated with variation among elements in
the $i$th psu. $V_2(x'|i)$ is a random variable with M values in the set, one
for each psu. Therefore, $E_1V_2(x'|i)$ by definition is

$$E_1V_2(x'|i) = \sum_i^M P(i)V_2(x'|i)$$

That is, $E_1V_2(x'|i)$ is an average of M values of $V_2(x'|i)$ weighted by
P(i), the probability that the $i$th psu had of being in the sample.

Three illustrations of the application of Theorem 3.9 will be given.
In each case there will be five steps in finding the variance of x':

Step 1, find $E_2(x'|i)$
Step 2, find $V_1E_2(x'|i)$
Step 3, find $V_2(x'|i)$
Step 4, find $E_1V_2(x'|i)$
Step 5, combine results from Steps 2 and 4.

Illustration 3.15. This is a simple illustration, selected because
we know what the answer is from previous discussion and a linear
combination of random variables is not involved. Suppose x' in Theorem 3.9 is
simply the random variable X where X has an equal probability of being
any one of the $X_{ij}$ values in the set of $N = \sum_i^M N_i$. We know that the
variance of X can be expressed as follows:

$$V(x') = \frac{1}{N}\sum_i^M\sum_j^{N_i}(X_{ij}-\bar{X}_{..})^2 \qquad (3.39)$$

In the case of two-stage sampling an equivalent method of selecting a
value of X is to select a psu first and then select an element within the
psu, the condition being that $P(ij) = P(i)P(j|i) = \frac{1}{N}$. This condition is
satisfied by letting $P(i) = \frac{N_i}{N}$ and $P(j|i) = \frac{1}{N_i}$. We now want to find
V(X) by using Theorem 3.9 and check the result with Equation (3.39).

Step 1. From the random selection specifications we know that
$E_2(x'|i) = \bar{X}_{i.}$. Therefore,

Step 2. $V_1E_2(x'|i) = V_1(\bar{X}_{i.})$

We know that $\bar{X}_{i.}$ is a random variable that has a probability of $\frac{N_i}{N}$ of being
equal to the $i$th value in the set $\bar{X}_{1.},\ldots,\bar{X}_{M.}$. Therefore, by definition
of the variance of a random variable,

$$V_1E_2(x'|i) = \sum_i^M \frac{N_i}{N}(\bar{X}_{i.}-\bar{X}_{..})^2 \qquad (3.40)$$

Step 3. By definition

$$V_2(x'|i) = \sum_j^{N_i}\frac{1}{N_i}(X_{ij}-\bar{X}_{i.})^2$$

Step 4. Since each value of $V_2(x'|i)$ has a probability $\frac{N_i}{N}$

$$E_1V_2(x'|i) = \sum_i^M \frac{N_i}{N}\sum_j^{N_i}\frac{1}{N_i}(X_{ij}-\bar{X}_{i.})^2 \qquad (3.41)$$

Step 5. From Equations (3.40) and (3.41) we obtain

$$V(x') = \frac{1}{N}\left[\sum_i^M N_i(\bar{X}_{i.}-\bar{X}_{..})^2 + \sum_i^M\sum_j^{N_i}(X_{ij}-\bar{X}_{i.})^2\right] \qquad (3.42)$$

The fact that Equations (3.42) and (3.39) are the same is verified
by Equation (1.10) in Chapter I.
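The equality of Equations (3.39) and (3.42) can also be checked numerically. The sketch below follows the steps of Illustration 3.15 on a hypothetical population of three psu's; the variable names are assumptions made for this example.

```python
from fractions import Fraction

# Hypothetical X_ij values, one inner list per psu.
psus = [[1, 3], [2, 4, 9], [5, 6, 8]]
N = sum(len(unit) for unit in psus)
grand_mean = Fraction(sum(sum(unit) for unit in psus), N)

# Equation (3.39): variance of X over all N elements.
total_variance = sum((x - grand_mean) ** 2 for unit in psus for x in unit) / N

between = Fraction(0)   # V_1 E_2(x'|i), Equation (3.40)
within = Fraction(0)    # E_1 V_2(x'|i), Equation (3.41)
for unit in psus:
    p_i = Fraction(len(unit), N)                   # P(i) = N_i / N
    psu_mean = Fraction(sum(unit), len(unit))      # E_2(x'|i)
    v2 = sum((x - psu_mean) ** 2 for x in unit) / len(unit)   # V_2(x'|i)
    between += p_i * (psu_mean - grand_mean) ** 2
    within += p_i * v2

print(total_variance, between, within, between + within)
```

The between and within components add up to the total variance, which is Theorem 3.9 applied to a single randomly selected element.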

Illustration 3.16. Find the variance of the estimator x' given by
Equation (3.32) assuming simple random sampling at both stages of sampling.

Step 1. Theorem 3.7 is applicable. That is,

$$E_2(x'|i) = \sum_i^m\sum_j^{n_i} E_2\left[\frac{M}{m}\frac{N_i}{n_i}x_{ij}\,\Big|\,i\right]$$

which means "sum the conditional expected values of each of the n terms
in Equation (3.33)."

With regard to any one of the terms in Equation (3.33), the
conditional expectation is

$$E_2\left[\frac{M}{m}\frac{N_i}{n_i}x_{ij}\,\Big|\,i\right] = \frac{M}{m}\frac{N_i}{n_i}E_2(x_{ij}|i) = \frac{M}{m}\frac{N_i}{n_i}\bar{X}_{i.} = \frac{M}{m}\frac{X_{i.}}{n_i}$$

Therefore

$$E_2(x'|i) = \sum_i^m\sum_j^{n_i}\frac{M}{m}\frac{X_{i.}}{n_i} \qquad (3.43)$$

With reference to Equation (3.43) and summing with respect to j, we have

$$\sum_j^{n_i}\frac{M}{m}\frac{X_{i.}}{n_i} = \frac{M}{m}X_{i.}$$

Hence Equation (3.43) becomes

$$E_2(x'|i) = \frac{M}{m}\sum_i^m X_{i.} \qquad (3.44)$$

Step 2. Find $V_1E_2(x'|i)$. This is simple because $\frac{\sum_i^m X_{i.}}{m}$ in Equation
(3.44) is the mean of a random sample of m from the set of psu totals
$X_{1.},\ldots,X_{M.}$. Therefore,

$$V_1E_2(x'|i) = M^2\,\frac{M-m}{M-1}\,\frac{\sigma_{b1}^2}{m} \qquad (3.45)$$

where

$$\sigma_{b1}^2 = \frac{\sum_i^M (X_{i.}-\bar{X}_{.})^2}{M}$$

In the subscript to $\sigma^2$, the "b" indicates between psu variance and "1"
distinguishes this variance from between psu variances in later
illustrations.

Step 3. Finding $V_2(x'|i)$ is more involved because the conditional
variance of a linear combination of random variables must be derived.
However, this is analogous to using Theorem 3.5 for finding the variance
of a linear combination of random variables. Theorem 3.5 applies except
that V(u|i) replaces V(u) and conditional variance and conditional
covariance replace the variances and covariances in the formula for V(u).
As the solution proceeds, notice that the strategy is to shape the problem
so previous results can be used.

Look at the estimator x', Equation (3.33), and determine whether any
covariances exist. An element selected from one psu is independent of an
element selected from another; but within a psu the situation is the same
as the one we had when finding the variance of the mean of a simple random
sample. This suggests writing x' in terms of $\bar{x}_{i.}$ because the $\bar{x}_{i.}$'s are
independent. Accordingly, we will start with

$$x' = \frac{M}{m}\sum_i^m N_i\bar{x}_{i.}$$

Hence

$$V_2(x'|i) = V_2\left[\frac{M}{m}\sum_i^m N_i\bar{x}_{i.}\,\Big|\,i\right]$$

Since the $\bar{x}_{i.}$'s are independent

$$V_2(x'|i) = \frac{M^2}{m^2}\sum_i^m V_2(N_i\bar{x}_{i.}|i)$$

and since $N_i$ is constant with regard to the conditional variance

$$V_2(x'|i) = \frac{M^2}{m^2}\sum_i^m N_i^2\,V_2(\bar{x}_{i.}|i) \qquad (3.46)$$

Since the sampling within each psu is simple random sampling

$$V_2(\bar{x}_{i.}|i) = \frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i} \qquad (3.47)$$

where

$$\sigma_i^2 = \frac{\sum_j^{N_i}(X_{ij}-\bar{X}_{i.})^2}{N_i}$$

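Step 4. After substituting the value of $V_2(\bar{x}_{i.}|i)$ in Equation (3.46),
and then applying Theorem 3.3, we have

$$E_1V_2(x'|i) = \frac{M^2}{m^2}\sum_i^m E_1\left[N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i}\right]$$

Since the first stage of sampling was simple random sampling and each psu
had an equal chance of being in the sample,

$$E_1\left[N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i}\right] = \frac{1}{M}\sum_i^M N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i}$$

Hence

$$E_1V_2(x'|i) = \frac{M}{m}\sum_i^M N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i} \qquad (3.48)$$

Step 5. Combining Equation (3.48) and Equation (3.45) the answer is

$$V(x') = M^2\,\frac{M-m}{M-1}\,\frac{\sigma_{b1}^2}{m} + \frac{M}{m}\sum_i^M N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i} \qquad (3.49)$$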
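The result of Illustration 3.16 can be checked by listing every possible two-stage sample from a small population. The population, the sample sizes, and the function names below are hypothetical; the sketch assumes simple random sampling without replacement at both stages, as in the illustration, and requires every psu to contain at least two elements.

```python
from fractions import Fraction
from itertools import combinations, product

psus = [[1, 3], [2, 4, 9], [5, 6, 8]]   # hypothetical X_ij values per psu
n_per_psu = [1, 2, 2]                   # hypothetical n_i for each psu
M, m = len(psus), 2

def enumerate_estimator():
    """Exact mean and variance of x' over every possible two-stage sample."""
    first_stage = list(combinations(range(M), m))
    mean = second_moment = Fraction(0)
    for chosen in first_stage:
        within = [list(combinations(psus[i], n_per_psu[i])) for i in chosen]
        n_second = 1
        for options in within:
            n_second *= len(options)
        prob = Fraction(1, len(first_stage) * n_second)
        for sample in product(*within):
            # Equation (3.32): x' = (M/m) * sum_i (N_i/n_i) * sum_j x_ij
            x_prime = Fraction(M, m) * sum(
                Fraction(len(psus[i]), n_per_psu[i]) * sum(elements)
                for i, elements in zip(chosen, sample))
            mean += prob * x_prime
            second_moment += prob * x_prime ** 2
    return mean, second_moment - mean ** 2

def variance_from_equation_3_49():
    totals = [Fraction(sum(unit)) for unit in psus]
    mean_total = sum(totals) / M
    sigma_b1 = sum((t - mean_total) ** 2 for t in totals) / M
    between = M ** 2 * Fraction(M - m, M - 1) * sigma_b1 / m
    within = Fraction(0)
    for unit, n_i in zip(psus, n_per_psu):
        N_i = len(unit)
        unit_mean = Fraction(sum(unit), N_i)
        sigma_i = sum((x - unit_mean) ** 2 for x in unit) / N_i
        within += N_i ** 2 * Fraction(N_i - n_i, N_i - 1) * sigma_i / n_i
    return between + Fraction(M, m) * within

mean, variance = enumerate_estimator()
population_total = sum(sum(unit) for unit in psus)
print(mean, population_total, variance, variance_from_equation_3_49())
```

The enumerated mean equals the population total (unbiasedness, Illustration 3.14) and the enumerated variance agrees with the between-psu plus within-psu formula.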

Illustration 3.17. The sampling specifications are: (1) at the first
stage select m psu's with replacement and probability $P(i) = \frac{N_i}{N}$, and (2)
at the second stage a simple random sample of $\bar{n}$ elements is to be selected
from each of the m psu's selected at the first stage. This will give a
sample of $n = m\bar{n}$ elements. Find the variance of the sample estimate of the
population total.

The estimator needs to be changed because the psu's are not selected
with equal probability. Sample values need to be weighted by the
reciprocals of their probabilities of selection if the estimator is to be
unbiased. Let

P'(ij) be the probability of element ij being in the sample,

P'(i) be the relative frequency of the $i$th psu being in a sample
of m, and let

P'(j|i) equal the conditional probability of element ij being in
the sample given that the $i$th psu is already in the sample.

Then

$$P'(ij) = P'(i)P'(j|i)$$

According to the sampling specifications $P'(i) = m\frac{N_i}{N}$. This
probability was described as relative frequency because "probability of being
in a sample of m psu's" is subject to misinterpretation. The $i$th psu
can appear in a sample more than once and it is counted every time it
appears. That is, if the $i$th psu is selected more than once, a sample of
$\bar{n}$ is selected within the $i$th psu every time that it is selected. By
substitution

$$P'(ij) = \left[m\frac{N_i}{N}\right]\left[\frac{\bar{n}}{N_i}\right] = \frac{m\bar{n}}{N} = \frac{n}{N} \qquad (3.50)$$

Equation (3.50) means that every element has an equal probability of being
in the sample. Consequently, the estimator is very simple,

$$x' = \frac{N}{m\bar{n}}\sum_i^m\sum_j^{\bar{n}} x_{ij} \qquad (3.51)$$

Exercise 3.18. Show that x', Equation (3.51), is an unbiased estimator
of the population total.

In finding V(x') our first step was to solve for $E_2(x'|i)$.

Step 1. By definition

$$E_2(x'|i) = E_2\left[\frac{N}{m\bar{n}}\sum_i^m\sum_j^{\bar{n}} x_{ij}\,\Big|\,i\right]$$

Since $\frac{N}{m\bar{n}}$ is constant with regard to $E_2$,

$$E_2(x'|i) = \frac{N}{m\bar{n}}\sum_i^m\sum_j^{\bar{n}} E_2(x_{ij}|i) \qquad (3.52)$$

Proceeding from Equation (3.52) to the following result is left as an
exercise:

$$E_2(x'|i) = \frac{N}{m}\sum_i^m \bar{X}_{i.} \qquad (3.53)$$
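Step 2. From Equation (3.53) we have

$$V_1E_2(x'|i) = V_1\left[\frac{N}{m}\sum_i^m \bar{X}_{i.}\right]$$

Since the $\bar{X}_{i.}$'s are independent

$$V_1E_2(x'|i) = \frac{N^2}{m^2}\sum_i^m V_1(\bar{X}_{i.})$$

Because the first stage of sampling is sampling with probability
proportional to $N_i$ and with replacement,

$$V_1(\bar{X}_{i.}) = \sum_i^M \frac{N_i}{N}(\bar{X}_{i.}-\bar{X}_{..})^2 \qquad (3.54)$$

Let

$$\sigma_{b2}^2 = V_1(\bar{X}_{i.})$$

Then

$$V_1E_2(x'|i) = \frac{N^2}{m}\,\sigma_{b2}^2 \qquad (3.55)$$

Exercise 3.19. Prove that $E(\bar{X}_{i.}) = \bar{X}_{..}$ which shows that it is
appropriate to use $\bar{X}_{..}$ in Equation (3.54).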

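Step 3. To find $V_2(x'|i)$, first write the estimator as

$$x' = \frac{N}{m}\sum_i^m \bar{x}_{i.} \qquad (3.56)$$

Then, since the $\bar{x}_{i.}$'s are independent

$$V_2(x'|i) = \frac{N^2}{m^2}\sum_i^m V_2(\bar{x}_{i.}|i)$$

and

$$V_2(\bar{x}_{i.}|i) = \frac{N_i-\bar{n}}{N_i-1}\,\frac{\sigma_i^2}{\bar{n}}$$

where

$$\sigma_i^2 = \frac{\sum_j^{N_i}(X_{ij}-\bar{X}_{i.})^2}{N_i}$$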



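Step 4.

$$E_1V_2(x'|i) = \frac{N^2}{m^2}\sum_i^m E_1\left[V_2(\bar{x}_{i.}|i)\right]$$

Since the probability of $V_2(\bar{x}_{i.}|i)$ is $\frac{N_i}{N}$

$$E_1V_2(x'|i) = \frac{N^2}{m^2}\,m\sum_i^M \frac{N_i}{N}\,\frac{N_i-\bar{n}}{N_i-1}\,\frac{\sigma_i^2}{\bar{n}}$$

which becomes

$$E_1V_2(x'|i) = \frac{N^2}{m\bar{n}}\sum_i^M \frac{N_i}{N}\,\frac{N_i-\bar{n}}{N_i-1}\,\sigma_i^2 \qquad (3.57)$$

Step 5. Combining Equation (3.55) and Equation (3.57) we have the
answer

$$V(x') = \frac{N^2}{m}\,\sigma_{b2}^2 + \frac{N^2}{m\bar{n}}\sum_i^M \frac{N_i}{N}\,\frac{N_i-\bar{n}}{N_i-1}\,\sigma_i^2 \qquad (3.58)$$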
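As with the previous illustration, the variance for sampling psu's with probability proportional to size and with replacement can be checked by enumeration. The sketch below is a hedged illustration: the population and sample sizes are hypothetical, a fresh simple random sample is drawn each time a psu is selected (as the text specifies), and the between-psu and within-psu components are computed from the definitions $\sum (N_i/N)(\bar{X}_{i.}-\bar{X}_{..})^2$ and $\sigma_i^2$ with divisor $N_i$.

```python
from fractions import Fraction
from itertools import combinations, product

psus = [[1, 3], [2, 4, 9], [5, 6, 8]]   # hypothetical X_ij values per psu
m, n_bar = 2, 2                         # hypothetical sample sizes
N = sum(len(unit) for unit in psus)

def enumerate_estimator():
    """Exact mean and variance of x' = (N / (m * n_bar)) * sum of sample values."""
    mean = second_moment = Fraction(0)
    for draw in product(range(len(psus)), repeat=m):   # ordered, with replacement
        p_first = Fraction(1)
        for i in draw:
            p_first *= Fraction(len(psus[i]), N)        # P(i) = N_i / N
        within = [list(combinations(psus[i], n_bar)) for i in draw]
        n_second = 1
        for options in within:
            n_second *= len(options)
        prob = p_first / n_second
        for sample in product(*within):
            x_prime = Fraction(N, m * n_bar) * sum(sum(e) for e in sample)
            mean += prob * x_prime
            second_moment += prob * x_prime ** 2
    return mean, second_moment - mean ** 2

def variance_from_components():
    grand_mean = Fraction(sum(sum(unit) for unit in psus), N)
    between = within = Fraction(0)
    for unit in psus:
        N_i = len(unit)
        p_i = Fraction(N_i, N)
        unit_mean = Fraction(sum(unit), N_i)
        sigma_i = sum((x - unit_mean) ** 2 for x in unit) / N_i
        between += p_i * (unit_mean - grand_mean) ** 2
        within += p_i * Fraction(N_i - n_bar, N_i - 1) * sigma_i
    return Fraction(N ** 2, m) * between + Fraction(N ** 2, m * n_bar) * within

mean, variance = enumerate_estimator()
population_total = sum(sum(unit) for unit in psus)
print(mean, population_total, variance, variance_from_components())
```

For this hypothetical population the estimator is unbiased for the total of 38 and both routes give the same variance.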



CHAPTER IV. THE DISTRIBUTION OF AN ESTIMATE

4.1 PROPERTIES OF SIMPLE RANDOM SAMPLES

The distribution of an estimate is a primary basis for judging the
accuracy of an estimate from a sample survey. But an estimate is only
one number. How can one number have a distribution? Actually,
"distribution of an estimate" is a phrase that refers to the distribution of
all possible estimates that might occur under repetition of a prescribed
sampling plan and estimator (method of estimation). Thanks to theory
and empirical testing of the theory, it is not necessary to generate
physically the distribution of an estimate by selecting numerous samples
and making an estimate from each. However, to have a tangible
distribution of an estimate as a basis for discussion, an illustration has been
prepared.

Illustration 4.1. Consider simple random samples of 4 from an
assumed population of 8 elements. There are $\frac{N!}{n!(N-n)!} = \frac{8!}{4!\,4!} = 70$ possible
samples. In Table 4.1, the sample values for all of the 70 possible
samples of four are shown. The 70 samples were first listed in an orderly
manner to facilitate getting all of them accurately recorded. The mean,
$\bar{x}$, for each sample was computed and the samples were then arrayed
according to the value of $\bar{x}$ for purposes of presentation in Table 4.1.
The distribution of $\bar{x}$ is the 70 values of $\bar{x}$ shown in Table 4.1, including
the fact that each of the 70 values of $\bar{x}$ has an equal probability of being
the estimate. These 70 values have been arranged as a frequency
distribution in Table 4.2.

As discussed previously, one of the properties of simple random
sampling is that the sample average is an unbiased estimate of the
population average; that is, $E(\bar{x}) = \bar{X}$. This means that the distribution of



Table 4.1.--Samples of four elements from a population of eight 1/

Sample :  Values   :       :         :: Sample :  Values   :       :
number :  of x_i   : x-bar :   s^2   :: number :  of x_i   : x-bar :   s^2
-------:-----------:-------:---------::--------:-----------:-------:--------
   1   : 2,1,6,4   : 3.25  :  4.917  ::  36s   : 1,6,8,9   : 6.00  : 12.667
   2   : 2,1,4,7   : 3.50  :  7.000  ::  37s   : 1,4,8,11  : 6.00  : 19.333
   3   : 2,1,4,8   : 3.75  :  9.583  ::  38s   : 2,6,8,9   : 6.25  :  9.583
   4   : 2,1,6,7   : 4.00  :  8.667  ::  39s   : 2,4,8,11  : 6.25  : 16.250
   5   : 2,1,4,9   : 4.00  : 12.667  ::  40s   : 1,6,7,11  : 6.25  : 16.917
   6   : 2,1,6,8   : 4.25  : 10.917  ::  41s   : 1,4,11,9  : 6.25  : 20.917
   7   : 2,1,6,9   : 4.50  : 13.667  ::  42    : 1,7,8,9   : 6.25  : 12.917
   8   : 2,1,4,11  : 4.50  : 20.333  ::  43cs  : 6,4,7,8   : 6.25  :  2.917
   9cs : 2,1,7,8   : 4.50  : 12.333  ::  44s   : 2,6,7,11  : 6.50  : 13.667
  10   : 1,6,4,7   : 4.50  :  7.000  ::  45s   : 2,4,11,9  : 6.50  : 17.667
  11s  : 2,1,7,9   : 4.75  : 14.917  ::  46    : 2,7,8,9   : 6.50  :  9.667
  12   : 2,6,4,7   : 4.75  :  4.917  ::  47s   : 1,6,8,11  : 6.50  : 17.667
  13   : 1,6,4,8   : 4.75  :  8.917  ::  48s   : 6,4,7,9   : 6.50  :  4.333
  14   : 2,1,6,11  : 5.00  : 20.667  ::  49s   : 2,6,8,11  : 6.75  : 14.250
  15s  : 2,1,8,9   : 5.00  : 16.667  ::  50s   : 1,6,11,9  : 6.75  : 18.917
  16   : 2,6,4,8   : 5.00  :  6.667  ::  51    : 1,7,8,11  : 6.75  : 17.583
  17   : 1,6,4,9   : 5.00  : 11.333  ::  52s   : 6,4,8,9   : 6.75  :  4.917
  18s  : 1,4,7,8   : 5.00  : 10.000  ::  53s   : 2,6,11,9  : 7.00  : 15.333
  19s  : 2,1,7,11  : 5.25  : 21.583  ::  54    : 2,7,8,11  : 7.00  : 14.000
  20   : 2,6,4,9   : 5.25  :  8.917  ::  55    : 1,7,11,9  : 7.00  : 18.667
  21s  : 2,4,7,8   : 5.25  :  7.583  ::  56s   : 6,4,7,11  : 7.00  :  8.667
  22s  : 1,4,7,9   : 5.25  : 12.250  ::  57    : 4,7,8,9   : 7.00  :  4.667
  23s  : 2,1,8,11  : 5.50  : 23.000  ::  58    : 2,7,11,9  : 7.25  : 14.917
  24s  : 2,4,7,9   : 5.50  :  9.667  ::  59    : 1,8,11,9  : 7.25  : 18.917
  25   : 1,6,4,11  : 5.50  : 17.667  ::  60s   : 6,4,8,11  : 7.25  :  8.917
  26s  : 1,6,7,8   : 5.50  :  9.667  ::  61    : 2,8,11,9  : 7.50  : 15.000
  27s  : 1,4,8,9   : 5.50  : 13.667  ::  62cs  : 6,4,11,9  : 7.50  :  9.667
  28cs : 2,1,11,9  : 5.75  : 24.917  ::  63    : 6,7,8,9   : 7.50  :  1.667
  29   : 2,6,4,11  : 5.75  : 14.917  ::  64    : 4,7,8,11  : 7.50  :  8.333


30s 


2 


.6 


,7 


>*8 


5.75 


6,917 1 


; 65 


4 


,7 


,11 ,9 


7.75 


8.917 




. 2 


,4 


>8 


,9 


5.75 


10.917 i 


! 66 


. 6 


,7 


,8\ll 


8.00 


4.667 


32s 


1 


>6 




,9 


5.75 


11.583 ! 


i 67 


4 


.8 


,11'.9 


; 8.00 


8.667 


33s 


1 


>4 


,7 


,11 


5.75 


18.2^0 t .68 


6 


,7 


.11.9 


8.25 


4,917 


34s 


2 


i6 


,7 


,9 


6.00 


8.667 69 


6 


!i 


,11.9 


8.50 - 


4.333 


35s 


2 


i4 


,7 


.11 


6.00 


15.333 : 70c 


7 


,8 


,11.9, 


8.75 


2.917 



ERIC 



1/ Values of X for Che populscion of eight elements are ^. « 2, X. « 1, 
Xj"*" 6, X^ • 4^ Xj - 7, Xg - 8, X^ - 11, Xg «*9; X - a.OO; *and ' ^ ^* 

■j E(X.-)C)^ , ■ „ 



BEST oon «IMUU£ 

I 

Table 4.2--Sampling distribution of $\bar{x}$

                        Relative frequency of $\bar{x}$
                 -------------------------------------------------
   $\bar{x}$     Simple random     Cluster       Stratified random
                 sampling          sampling      sampling
-------------------------------------------------------------------
   3.25               1               1
   3.50               1
   3.75               1
   4.00               2
   4.25               1
   4.50               4               1               1
   4.75               3                               1
   5.00               5                               2
   5.25               4                               3
   5.50               5                               4
   5.75               6               1               5
   6.00               4                               4
   6.25               6               1               5
   6.50               5                               4
   6.75               4                               3
   7.00               5                               2
   7.25               3                               1
   7.50               4               1               1
   7.75               1
   8.00               2
   8.25               1
   8.50               1
   8.75               1               1
-------------------------------------------------------------------
   Total             70               6              36

   Expected value
   of $\bar{x}$     6.00            6.00            6.00

   Variance of
   $\bar{x}$        1.50            3.29            0.49



$\bar{x}$ is centered on $\bar{X}$. If the theory is correct, the average
of $\bar{x}$ for the 70 samples, which are equally likely to occur, should
be equal to the population average, 6.00. The average of the 70 samples
does equal 6.00.

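From the theory of expected values, we also know that the variance
of $\bar{x}$ is given by

    $S_{\bar{x}}^2 = \frac{N-n}{N}\,\frac{S^2}{n}$                      (4.1)

where

    $S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}$

With reference to Illustration 4.1 and Table 4.1, $S^2 = 12.00$ and
$S_{\bar{x}}^2 = \frac{8-4}{8}\cdot\frac{12}{4} = 1.5$. The formula (4.1)
can be verified by computing the variance among the 70 values of
$\bar{x}$ as follows:

    $\frac{(3.25-6.00)^2 + (3.50-6.00)^2 + \cdots + (8.75-6.00)^2}{70} = 1.5$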

Since $S^2$ is a population parameter, it is usually unknown. Fortu-
nately, as discussed in Chapter 3, $E(s^2) = S^2$ where

    $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

In Table 4.1, the value of $s^2$ is shown for each of the 70 samples. The
average of the 70 values of $s^2$ is equal to $S^2$. The fact that
$E(s^2) = S^2$ is another important property of simple random samples.
In practice $s^2$ is used as an estimate of $S^2$. That is,

    $s_{\bar{x}}^2 = \frac{N-n}{N}\,\frac{s^2}{n}$

is an unbiased estimate of the variance of $\bar{x}$.

To recapitulate, we have just verified three important properties of
simple random samples:

    (1) $E(\bar{x}) = \bar{X}$

    (2) $S_{\bar{x}}^2 = \frac{N-n}{N}\,\frac{S^2}{n}$

    (3) $E(s^2) = S^2$

The standard error of $\bar{x}$, namely $S_{\bar{x}}$, is a measure of how
much $\bar{x}$ varies under repeated sampling from X. Incidentally, notice
that Equation (4.1) shows how the variance of $\bar{x}$ is related to the
size of the sample. Now we need to consider the form or shape of the
distribution of $\bar{x}$.
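The three properties can also be checked by brute force. The following is a minimal sketch, not part of the original text, that enumerates all 70 samples of Illustration 4.1 using the population values from the footnote to Table 4.1; the helper names `average` and `variance` are illustrative choices.

```python
from itertools import combinations

population = [2, 1, 6, 4, 7, 8, 11, 9]  # X_1..X_8, footnote 1/ of Table 4.1
N, n = len(population), 4


def average(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by `divisor`."""
    m = average(values)
    return sum((v - m) ** 2 for v in values) / divisor


# All equally likely simple random samples of four elements.
samples = list(combinations(population, n))
sample_means = [average(s) for s in samples]
sample_s2 = [variance(s, n - 1) for s in samples]

S2 = variance(population, N - 1)                       # population S^2
expected_mean = average(sample_means)                  # E(x-bar)
variance_of_mean = variance(sample_means, len(samples))  # divisor 70, as in the text
expected_s2 = average(sample_s2)                       # E(s^2)
formula_4_1 = (N - n) / N * S2 / n                     # Equation (4.1)

print(len(samples))
print(round(expected_mean, 3), round(variance_of_mean, 3), round(formula_4_1, 3))
print(round(S2, 3), round(expected_s2, 3))
```

The printed values should agree with the 70 samples, the mean of 6.00, the variance of 1.5, and $S^2 = 12.00$ quoted in the text.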

Definition 4.1 . The distribution of an estimate is often called the 
sampling distribution. It refers to the distribution of all possible 
values of an estimate that could occur under a prescribed sampling plan. 
4.2 SHAPE OF THE SAMPLING DISTRIBUTION

For random sampling there is a large volume of literature on the
distribution of an estimate which we will not attempt to review. In
practice, the distribution is generally accepted as being normal (See
Figure 4.1) unless the sample size is "small." The theory and empirical
tests show that the distribution of an estimate approaches the normal
distribution rapidly as the size of the sample increases. The closeness
of the distribution of an estimate to the normal distribution depends on:
(1) the distribution of X (i.e., the shape of the frequency distribution
of the values of X in the population being sampled), (2) the form of the
estimator, (3) the sample design, and (4) the sample size. It is not
possible to give a few simple, exact guidelines for deciding when the
degree of approximation is good enough. In practice, it is generally a
matter of working as though the distribution of an estimate is normal but
being mindful of the possibility that the distribution might differ
considerably from normal when the sample is very small and the population
distribution is highly skewed. 2/

Figure 4.1--Distribution of an estimate (normal distribution)

It is very fortunate that the sampling distribution is approximately
normal as it gives a basis for probability statements about the precision
of an estimate. As notation, $x'$ will be the general expression for any
estimate, and $\sigma_{x'}$ is the standard error of $x'$.

Figure 4.1 is a graphical representation of the sampling distribution
of an estimate. It is the normal distribution. In the mathematical
equation for the normal distribution of a variable there are two parameters:
the average value of the variable and the standard error of the variable.



2/ For a good discussion of the distribution of a sample estimate, see
Vol. I, Chapter 1, Hansen, Hurwitz, and Madow. Sample Survey Methods and
Theory, John Wiley and Sons, 1953.



Suppose $x'$ is an estimate from a probability sample. The characteristics
of the sampling distribution of $x'$ are specified by three things: (1) the
expected value of $x'$, $E(x')$, which is the mean of the distribution; (2) the
standard error of $x'$, $\sigma_{x'}$; and (3) the assumption that the
distribution is normal. If $x'$ is normally distributed, two-thirds of the
values that $x'$ could equal are between $[E(x') - \sigma_{x'}]$ and
$[E(x') + \sigma_{x'}]$, 95 percent of the possible values of $x'$ are
between $[E(x') - 2\sigma_{x'}]$ and $[E(x') + 2\sigma_{x'}]$, and
99.7 percent of the estimates are within $3\sigma_{x'}$ from $E(x')$.
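The rounded proportions quoted above follow from the normal distribution itself. The sketch below is an illustration added here, and the function name `normal_coverage` is an assumption. It uses the standard identity that a normal variable falls within k standard deviations of its mean with probability erf(k/sqrt(2)).

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard
    deviations (standard errors) of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    # Proportion of possible estimates within k standard errors of E(x').
    print(k, round(normal_coverage(k), 4))
```

The value for k = 1 is a little above two-thirds, and the values for k = 2 and k = 3 round to the 95 percent and 99.7 percent stated in the text.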

Exercise 4.1. With reference to Illustration 4.1, find
$E(\bar{x}) - \sigma_{\bar{x}}$ and $E(\bar{x}) + \sigma_{\bar{x}}$. Refer
to Table 4.2 and find the proportion of the 70 values of $\bar{x}$ that are
between $E(\bar{x}) - \sigma_{\bar{x}}$ and $E(\bar{x}) + \sigma_{\bar{x}}$.
How does this compare with the expected proportion assuming the sampling
distribution of $\bar{x}$ is normal? The normal approximation is not
expected to be close, owing to the small size of the population and of the
sample. Also compute $E(\bar{x}) - 2\sigma_{\bar{x}}$ and
$E(\bar{x}) + 2\sigma_{\bar{x}}$ and find the proportion of the 70 values
of $\bar{x}$ that are between these two limits.

4.3 SAMPLE DESIGN 

There are many methods of designing and selecting samples and of making
estimates from samples. Each sampling method and estimator has a sampling
distribution. Since the sampling distribution is assumed to be normal,
alternative methods are compared in terms of $E(x')$ and $\sigma_{x'}$
(or $\sigma_{x'}^2$).

For simple random sampling, we have seen, for a sample of n, that
every possible combination of n elements has an equal chance of being the
sample selected. Some of these possible combinations (samples) are much
better than others. It is possible to introduce restrictions in sampling
so some of the combinations cannot occur or so some combinations have a
higher probability of occurrence than others. This can be done without
introducing bias in the estimate $x'$ and without losing a basis for esti-
mating $\sigma_{x'}$. Discussion of particular sample designs is not a
primary purpose of this chapter. However, a few simple illustrations will
be used to introduce the subject of design and to help develop concepts of
sampling variation.

Illustration 4.2. Suppose the population of 8 elements used in
Table 4.1 is arranged so it consists of four sampling units as follows:

    Sampling Unit    Elements    Values of X               Sampling Unit Total

         1             1,2       $X_1 = 2$,  $X_2 = 1$              3

         2             3,4       $X_3 = 6$,  $X_4 = 4$             10

         3             5,6       $X_5 = 7$,  $X_6 = 8$             15

         4             7,8       $X_7 = 11$, $X_8 = 9$             20

For sampling purposes the population now consists of four sampling
units rather than eight elements. If we select a simple random sample of
two sampling units from the population of four sampling units, it is clear
that the sampling theory for simple random sampling applies. This illus-
tration points out the importance of making a clear distinction between a
sampling unit and an element that a measurement pertains to. A sampling
unit corresponds to a random selection and it is the variation among sam-
pling units (random selections) that determines the sampling error of an
estimate. When the sampling units are composed of more than one element,
the sampling is commonly referred to as cluster sampling because the ele-
ments in a sampling unit are usually close together geographically.

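For a simple random sample of 2 sampling units, the variance of
$\bar{x}_c$, where $\bar{x}_c$ is the sample average per sampling unit, is

    $S_{\bar{x}_c}^2 = \frac{N-n}{N}\,\frac{S_c^2}{n} = 13.17$

where N = 4, n = 2, and

    $S_c^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X}_c)^2}{N-1} = \frac{158}{3}$

is the variance among the four sampling unit totals, $X_i$.

Instead of the average per sampling unit one will probably be interested
in the average per element, which is $\bar{x} = \frac{\bar{x}_c}{2}$, since
there are two elements in each sampling unit. The variance of $\bar{x}$ is
one-fourth of the variance of $\bar{x}_c$. Hence, the variance of $\bar{x}$
is $\frac{13.17}{4} = 3.29$.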

There are only six possible random samples as follows:

    Sample    Sampling Units    Sample average per           $s_c^2$
                                sampling unit, $\bar{x}_c$

      1            1,2                  6.5                    24.5
      2            1,3                  9.0                    72.0
      3            1,4                 11.5                   144.5
      4            2,3                 12.5                    12.5
      5            2,4                 15.0                    50.0
      6            3,4                 17.5                    12.5

where $s_c^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}_c)^2}{n-1}$ and $x_i$ is a
sampling unit total. Be sure to notice that $s_c^2$ (which is the sample
estimate of $S_c^2$) is the variance among sampling units in the sample,
not the variance among individual elements in the sample. From the list of
six samples, it is easy to verify that $\bar{x}_c$ is an unbiased estimate
of the population average per sampling unit and that $s_c^2$ is an unbiased
estimate of $\frac{158}{3}$, the variance among the four sampling units in
the population. Also, the variance among the six values of $\bar{x}_c$ is
13.17 which agrees with the formula.

The six possible cluster samples are among the 70 samples listed in
Table 4.1. Their sample numbers in Table 4.1 are 1, 9, 28, 43, 62, and
70. A "c" follows these sample numbers. The sampling distribution for
the six samples is shown in Table 4.2 for comparison with simple random
sampling. It is clear from inspection that random selection from these
six is less desirable than random selection from the 70. For example,
one of the two extreme averages, 3.25 or 8.75, has a probability of
$\frac{1}{3}$ of occurring for the cluster sampling and a probability of
only $\frac{1}{35}$ when selecting a simple random sample of four elements.
In this illustration, the sampling restriction (clustering of elements)
increased the sampling variance from 1.5 to 3.29.
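The cluster-sampling figures in Illustration 4.2 can be reproduced by listing the six possible samples. This is a minimal sketch added for illustration; the variable names are assumptions, and the data are the four sampling unit totals given in the illustration.

```python
from itertools import combinations

unit_totals = [3, 10, 15, 20]   # sampling unit totals, Illustration 4.2
elements_per_unit = 2
N, n = len(unit_totals), 2


def average(values):
    return sum(values) / len(values)


def variance(values, divisor):
    m = average(values)
    return sum((v - m) ** 2 for v in values) / divisor


cluster_samples = list(combinations(unit_totals, n))           # six samples
means_per_unit = [average(s) for s in cluster_samples]         # x-bar_c
means_per_element = [m / elements_per_unit for m in means_per_unit]

S_c2 = variance(unit_totals, N - 1)                            # 158/3
formula_per_unit = (N - n) / N * S_c2 / n                      # about 13.17
var_per_unit = variance(means_per_unit, len(cluster_samples))
var_per_element = variance(means_per_element, len(cluster_samples))

print(len(cluster_samples), round(average(means_per_element), 2))
print(round(var_per_unit, 2), round(formula_per_unit, 2), round(var_per_element, 2))
```

Dividing the per-unit variance by four (the square of the two elements per unit) gives the per-element variance that the text compares with 1.5 for simple random sampling.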

It is of importance to note that the average variance among elements
within the four clusters is only 1.25. (Students should compute the within
cluster variances and verify 1.25.) This is much less than 12.00, the
variance among the 8 elements of the population. In reality, the variance
among elements within clusters is usually less than the variance among all
elements in the population, because clusters (sampling units) are usually
composed of elements that are close together and elements that are close
together usually show a tendency to be alike.

Exercise 4.2. In Illustration 4.2, if the average variance among
elements within clusters had been greater than 12.00, the sampling variance
for cluster sampling would have been less than the sampling variance for a
simple random sample of elements. Repeat what was done in Illustration 4.2
using as sampling units elements 1 and 6, 2 and 5, 3 and 8, and 4 and 7.
Study the results.

Illustration 4.3. Perhaps the most common method of sampling is to
assign sampling units of a population to groups called strata. A simple
random sample is then selected from each stratum. Suppose the population
used in Illustration 4.1 is divided into two strata as follows:

    Stratum 1    $X_1 = 2$, $X_2 = 1$, $X_3 = 6$, $X_4 = 4$

    Stratum 2    $X_5 = 7$, $X_6 = 8$, $X_7 = 11$, $X_8 = 9$

The sampling plan is to select a simple random sample of two elements
from each stratum. There are 36 possible samples of 4, two from each
stratum. These 36 samples are identified in Table 4.1 by an s after the
sample number so you may compare the 36 possible stratified random samples
with the 70 simple random samples and with the six cluster samples. Also,
see Table 4.2.

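Consider the variance of $\bar{x}$. We can write

    $\bar{x} = \frac{\bar{x}_1 + \bar{x}_2}{2}$

where $\bar{x}_1$ is the sample average for stratum 1 and $\bar{x}_2$ is
the average for stratum 2. According to Theorem 3.5

    $S_{\bar{x}}^2 = \frac{1}{4}\left(S_{\bar{x}_1}^2 + S_{\bar{x}_2}^2 + 2 S_{\bar{x}_1 \bar{x}_2}\right)$

We know the covariance, $S_{\bar{x}_1 \bar{x}_2}$, is zero because the
sampling from one stratum is independent of the sampling from the other
stratum. And, since the sample within each stratum is a simple random
sample,

    $S_{\bar{x}_1}^2 = \frac{N_1 - n_1}{N_1}\,\frac{S_1^2}{n_1}$  where  $S_1^2 = \frac{\sum_{i=1}^{N_1} (X_{1i} - \bar{X}_1)^2}{N_1 - 1}$

The subscript "1" refers to stratum 1. $S_{\bar{x}_2}^2$ is of the same
form as $S_{\bar{x}_1}^2$. Therefore,

    $S_{\bar{x}}^2 = \frac{1}{4}\left[\frac{N_1 - n_1}{N_1}\,\frac{S_1^2}{n_1} + \frac{N_2 - n_2}{N_2}\,\frac{S_2^2}{n_2}\right]$

Since $\frac{N_1 - n_1}{N_1} = \frac{N_2 - n_2}{N_2} = \frac{1}{2}$ and
$n_1 = n_2 = 2$,

    $S_{\bar{x}}^2 = \frac{1}{4}\left(\frac{S_1^2 + S_2^2}{4}\right) = \frac{1}{4}\left(\frac{4.92 + 2.92}{4}\right) = 0.49$

The variance, 0.49, is comparable to 1.5 in Illustration 4.1 and to 3.29 in
Illustration 4.2.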
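The stratified result can be checked the same way as the other two designs. The sketch below is an added illustration with assumed variable names. It lists the 36 possible stratified samples of Illustration 4.3 and compares the variance among their means with the formula built from the within-stratum variances.

```python
from itertools import combinations, product

stratum_1 = [2, 1, 6, 4]     # X_1..X_4
stratum_2 = [7, 8, 11, 9]    # X_5..X_8
n_per_stratum = 2


def average(values):
    return sum(values) / len(values)


def variance(values, divisor):
    m = average(values)
    return sum((v - m) ** 2 for v in values) / divisor


# Every pair from stratum 1 combined with every pair from stratum 2.
stratified_samples = [
    a + b
    for a, b in product(
        combinations(stratum_1, n_per_stratum),
        combinations(stratum_2, n_per_stratum),
    )
]
sample_means = [average(s) for s in stratified_samples]

S1_2 = variance(stratum_1, len(stratum_1) - 1)   # about 4.92
S2_2 = variance(stratum_2, len(stratum_2) - 1)   # about 2.92
formula = 0.25 * (S1_2 + S2_2) / 4               # final expression in the text
enumerated = variance(sample_means, len(sample_means))

print(len(stratified_samples), round(average(sample_means), 2))
print(round(S1_2, 2), round(S2_2, 2), round(formula, 2), round(enumerated, 2))
```

Because the two strata differ mainly in level (small values in one, large values in the other), the within-stratum variances are small and the resulting sampling variance is well below the 1.5 of simple random sampling.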

In Illustration 4.2, the sampling units were groups of two elements and
the variance among these groups (sampling units) appeared in the formula
for the variance of $\bar{x}$. In Illustration 4.3, each element was a
sampling unit but the selection process (randomization) was restricted to
taking one stratum (subset) at a time, so the sampling variance was
determined by variability within strata. As you study sampling plans, form
mental pictures of the variation which the sampling error depends on. With
experience and accumulated knowledge of what the patterns of variation in
various populations are like, one can become expert in judging the
efficiency of alternative sampling plans in relation to specific
objectives of a survey.

If the population and the samples in the above illustrations had been
larger, the distributions in Table 4.2 would have been approximately nor-
mal. Thus, since the form of the distribution of an estimate from a prob-
ability sample survey is accepted as being normal, only two attributes of
an estimate need to be evaluated, namely its expected value and its
variance.

In the above illustrations ideal conditions were implicitly assumed.
Such conditions do not exist in the real world so the theory must be
extended to fit, more exactly, actual conditions. There are numerous
sources of error or variation to be evaluated. The nature of the rela-
tionship between theory and practice is a major governing factor deter-
mining the rate of progress toward improvement of the accuracy of survey
results.

We will now extend error concepts toward more practical settings.



4.4 RESPONSE ERROR



So far, we have discussed sampling under implicit assumptions that
measurements are obtained from all n elements in a sample and that the
measurement for each element is without error. Neither assumption fits,
exactly, the real world. In addition, there are "coverage" errors of
various kinds. For example, for a farm survey a farm is defined but
application of the definition involves some degree of ambiguity about
whether particular enterprises satisfy the definition. Also, two persons
might have an interest in the same farm tract giving rise to the possibility
that the tract might be counted twice (included as a part of two farms) or
omitted entirely.

Partly to emphasize that error in an estimate is more than a matter
of sampling, statisticians often classify the numerous sources of error
into one of two general classes: (1) sampling errors, which are errors
associated with the fact that one has measurements for a sample of elements
rather than measurements for all elements in the population, and (2) non-
sampling errors, which are errors that occur whether sampling is involved
or not. Mathematical error models can be very complex when they include a
term for each of many sources of error and attempt to represent exactly the
real world. However, complicated error models are not always necessary,
depending upon the purposes.

For purposes of discussion, two oversimplified response-error models
will be used. This will introduce the subject of response error and give
some clues regarding the nature of the impact of response error on the
distribution of an estimate. For simplicity, we will assume that a
measurement is obtained for each element in a random sample and that no
ambiguity exists regarding the identity or definition of an element. Thus,
we will be considering sampling error and response error simultaneously.

Illustration 4.4. Let $T_1, \ldots, T_N$ be the "true values" of some
variable for the N elements of a population. The mention of true values
raises numerous questions about what is a true value. For example, what is
your true weight? How would you define the true weight of an individual? We
will refrain from discussing the problem of defining true values and simply
assume that true values do exist according to some practical definition.
When an attempt is made to ascertain $T_i$, some value other than $T_i$
might be obtained. Call the actual value obtained $X_i$. The difference,
$e_i = X_i - T_i$, is the response error for the i-th element. If the
characteristic, for example, is a person's weight, the observed weight,
$X_i$, for the i-th individual depends upon when and how the measurement is
taken. However, for simplicity, assume that $X_i$ is always the value
obtained regardless of the conditions under which the measurement is taken.
In other words, assume that the response error, $e_i$, is constant for the
i-th element. In this hypothetical case, we are actually sampling a
population set of values $X_1, \ldots, X_N$ instead of a set of true values
$T_1, \ldots, T_N$.



Under the conditions as stated, the sampling theory applies exactly
to the set of population values $X_1, \ldots, X_N$. If a simple random
sample of elements is selected and measurements for all elements in the
sample are obtained, then $E(\bar{x}) = \bar{X}$. That is, if the purpose
is to estimate $\bar{T} = \frac{\sum T_i}{N}$, the estimate is biased unless
$\bar{T}$ happens to be equal to $\bar{X}$. The bias is $\bar{X} - \bar{T}$
which is appropriately called "response bias."

Rewrite $e_i = X_i - T_i$ as follows:

    $X_i = T_i + e_i$                                            (4.2)

Then, the mean of a simple random sample may be expressed as

    $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{n} (t_i + e_i)}{n}$

or, as $\bar{x} = \bar{t} + \bar{e}$.

From the theory of expected values, we have

    $E(\bar{x}) = E(\bar{t}) + E(\bar{e})$

Since $E(\bar{x}) = \bar{X}$ and $E(\bar{t}) = \bar{T}$ it follows that

    $\bar{X} = \bar{T} + E(\bar{e})$

Thus, $\bar{x}$ is a biased estimate of $\bar{T}$ unless $E(\bar{e}) = 0$,
where $E(\bar{e}) = \frac{\sum e_i}{N}$. That is, $E(\bar{e})$ is the
average of the response errors, $e_i$, for the whole population.

For simple random sampling the variance of $\bar{x}$ is

    $S_{\bar{x}}^2 = \frac{N-n}{N}\,\frac{S_X^2}{n}$  where  $S_X^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}$

How does the response error affect the variance of X and of $\bar{x}$? We
have already written the observed value for the i-th element as being equal
to its true value plus a response error, that is, $X_i = T_i + e_i$.
Assuming random sampling, $T_i$ and $e_i$ are random variables. We can use
Theorem 3.5 from Chapter III and write

    $S_X^2 = S_T^2 + S_e^2 + 2 S_{Te}$                            (4.3)

where $S_X^2$ is the variance of X, $S_T^2$ is the variance of T, $S_e^2$
is the response variance (that is, the variance of e), and $S_{Te}$ is the
covariance of T and e. The terms on the right-hand side of Equation (4.3)
cannot be evaluated unless data on $X_i$ and $T_i$ are available; however,
the equation does show how the response error influences the variance of X
and hence of $\bar{x}$.

As a numerical example, assume a population of five elements and the
following values for T and X:

                  $T_i$     $X_i$     $e_i$

                   23        26         3
                   13        12        -1
                   17        23         6
                   25        25         0
                    7         9         2

    Average        17        19         2

Students may wish to verify the following results, especially the variance
of e and the covariance of T and e:

    $S_X^2 = 62.5$     $S_T^2 = 54.0$     $S_e^2 = 7.5$     $S_{Te} = 0.5$

As a verification of Equation (4.3) we have

    62.5 = 54.0 + 7.5 + (2)(0.5)
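The verification of Equation (4.3) can be repeated numerically. This is a short sketch added for illustration; the function name `covariance` is an assumed helper that uses the divisor N - 1, matching the definition of the variances in the text.

```python
T = [23, 13, 17, 25, 7]     # true values, numerical example of Illustration 4.4
X = [26, 12, 23, 25, 9]     # observed values
e = [x - t for x, t in zip(X, T)]   # response errors e_i = X_i - T_i


def average(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Covariance with divisor N - 1; covariance(a, a) is the variance."""
    ma, mb = average(a), average(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


S_X2 = covariance(X, X)
S_T2 = covariance(T, T)
S_e2 = covariance(e, e)
S_Te = covariance(T, e)

print(average(T), average(X), average(e))
print(S_X2, S_T2, S_e2, S_Te)
print(S_T2 + S_e2 + 2 * S_Te)   # right-hand side of Equation (4.3)
```

The last line should reproduce $S_X^2$, and the difference between the averages of X and T is the response bias discussed in the text.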



From data in a simple random sample one would compute

    $s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

and use $\frac{N-n}{N}\,\frac{s_x^2}{n}$ as an estimate of the variance of
$\bar{x}$. Is it clear that $s_x^2$ is an unbiased estimate of $S_X^2$
rather than of $S_T^2$ and that the impact of variation in $e_i$ is
included in $s_x^2$?

To summarize, response error caused a bias in $\bar{x}$ as an estimate of
$\bar{T}$ that was equal to $\bar{X} - \bar{T}$. In addition, it was a
source of variation included in the standard error of $\bar{x}$. To
evaluate bias and variance attributable to response error, information on
$X_i$ and $T_i$ must be available.

Illustration 4.5. In this case, we assume that the response error
for a given element is not constant. That is, if an element were measured
on several occasions, the observed values for the i-th element could vary
even though the true value, $T_i$, remained unchanged. Let the error model
be

    $X_{ij} = T_i + e_{ij}$

where $X_{ij}$ is the observed value of X for the i-th element when the
          observation is taken on a particular occasion, j,
      $T_i$ is the true value of X for the i-th element,
  and $e_{ij}$ is the response error for the i-th element on a particular
          occasion, j.

Assume, for any given element, that the response error, $e_{ij}$, is a
random variable. We can let $e_{ij} = \bar{e}_i + \epsilon_{ij}$, where
$\bar{e}_i$ is the average value of $e_{ij}$ for a fixed i, that is,
$\bar{e}_i = E(e_{ij} \mid i)$. This divides the response error for the
i-th element into two components: a constant component, $\bar{e}_i$, and a
variable component, $\epsilon_{ij}$. By definition, the expected value of
$\epsilon_{ij}$ is zero for any given element. That is,
$E(\epsilon_{ij} \mid i) = 0$.

Substituting $\bar{e}_i + \epsilon_{ij}$ for $e_{ij}$, the model becomes

    $X_{ij} = T_i + \bar{e}_i + \epsilon_{ij}$                    (4.4)

The model, Equation (4.4), is now in a good form for comparison with
the model in Illustration 4.4. In Equation (4.4), $\bar{e}_i$, like $e_i$
in Equation (4.2), is constant for a given element. Thus, the two models
are alike except for the added term, $\epsilon_{ij}$, in Equation (4.4)
which allows for the possibility that the response error for the i-th
element might not be constant.

Assume a simple random sample of n elements and one observation for
each element. According to the model, Equation (4.4), we may now write
the sample mean as follows:

    $\bar{x} = \frac{\sum t_i}{n} + \frac{\sum \bar{e}_i}{n} + \frac{\sum \epsilon_{ij}}{n}$

Summation with respect to j is not needed as there is only one observation
for each element in the sample. Under the conditions specified the expected
value of $\bar{x}$ may be expressed as follows:

    $E(\bar{x}) = \bar{T} + \bar{e}$

where $\bar{T} = \frac{\sum T_i}{N}$ and $\bar{e} = \frac{\sum \bar{e}_i}{N}$.

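The variance of $\bar{x}$ is complicated unless some further assumptions
are made. Assume that all covariance terms are zero. Also, assume that the
conditional variance of $\epsilon_{ij}$ is constant for all values of i;
that is, let $V(\epsilon_{ij} \mid i) = S_\epsilon^2$. Then, the variance
of $\bar{x}$ is

    $S_{\bar{x}}^2 = \frac{N-n}{N}\,\frac{S_T^2}{n} + \frac{N-n}{N}\,\frac{S_{\bar{e}}^2}{n} + \frac{S_\epsilon^2}{n}$

where $S_T^2 = \frac{\sum (T_i - \bar{T})^2}{N-1}$,
$S_{\bar{e}}^2 = \frac{\sum (\bar{e}_i - \bar{e})^2}{N-1}$,
and $S_\epsilon^2$ is the conditional variance of $\epsilon_{ij}$, that is,
$V(\epsilon_{ij} \mid i)$. For this model the variance of $\bar{x}$ does
not diminish to zero as n approaches N. However, assuming N is large, the
variance of $\bar{x}$, which becomes $\frac{S_\epsilon^2}{N}$, is probably
negligible.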
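The variance formula for this model can be checked exactly on a very small case. The sketch below is an added illustration and every number in it is hypothetical: four true values, constant error components chosen so that their covariance with the true values is zero, and a variable component that is -1 or +1 with equal probability. It enumerates every sample together with every pattern of the variable component and compares the exact variance with the formula.

```python
from itertools import combinations, product

T = [1.0, 2.0, 3.0, 4.0]           # hypothetical true values
e_bar = [1.0, -1.0, -1.0, 1.0]     # hypothetical constant components, cov(T, e_bar) = 0
eps_values = [-1.0, 1.0]           # hypothetical variable component, equally likely
N, n = len(T), 2


def average(values):
    return sum(values) / len(values)


def variance(values, divisor):
    m = average(values)
    return sum((v - m) ** 2 for v in values) / divisor


# Each (sample, epsilon pattern) pair is equally likely.
outcomes = []
for idx in combinations(range(N), n):
    for eps in product(eps_values, repeat=n):
        x_bar = sum(T[i] + e_bar[i] + d for i, d in zip(idx, eps)) / n
        outcomes.append(x_bar)

exact_mean = average(outcomes)
exact_var = variance(outcomes, len(outcomes))

S_T2 = variance(T, N - 1)
S_e2 = variance(e_bar, N - 1)
S_eps2 = variance(eps_values, len(eps_values))   # conditional variance of epsilon
fpc = (N - n) / N
formula_var = fpc * S_T2 / n + fpc * S_e2 / n + S_eps2 / n

print(round(exact_mean, 4), round(average(T) + average(e_bar), 4))
print(round(exact_var, 4), round(formula_var, 4))
```

Setting n equal to N in the formula leaves only the last term, which is the point made in the text about the variance not diminishing to zero.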

Definition 4.2. Mean-Square Error. In terms of the theory of expected
values the mean-square error of an estimate, $x'$, is $E(x' - T)^2$ where T
is the target value, that is, the value being estimated. From the theory it
is easy to show that

    $E(x' - T)^2 = [E(x') - T]^2 + E[x' - E(x')]^2$

Thus, the mean-square error, mse, can be expressed as follows:

    mse $= B^2 + \sigma_{x'}^2$                                   (4.5)

where $B = E(x') - T$                                             (4.6)

and $\sigma_{x'}^2 = E[x' - E(x')]^2$                             (4.7)

Definition 4.3. Bias. In Equation (4.5), B is the bias in $x'$ as
an estimate of T.

Definition 4.4. Precision. The precision of an estimate is the
standard error of the estimate, namely, $\sigma_{x'}$ in Equation (4.7).
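The decomposition in Equation (4.5) can be illustrated with the five-element example of Illustration 4.4. The sketch below is an added illustration; the sample size n = 2 is an assumption made only to keep the enumeration short, and the target is taken to be the average of the true values.

```python
from itertools import combinations

T = [23, 13, 17, 25, 7]     # true values from Illustration 4.4
X = [26, 12, 23, 25, 9]     # observed values from Illustration 4.4
n = 2                       # assumed sample size for this sketch


def average(values):
    return sum(values) / len(values)


target = average(T)                                    # T in Definition 4.2
estimates = [average(s) for s in combinations(X, n)]   # all equally likely x'

expected_value = average(estimates)                    # E(x')
bias = expected_value - target                         # B, Equation (4.6)
variance = average([(x - expected_value) ** 2 for x in estimates])  # Equation (4.7)
mse_direct = average([(x - target) ** 2 for x in estimates])        # E(x' - T)^2

print(expected_value, bias)
print(variance, bias ** 2 + variance, mse_direct)
```

The last two printed numbers agree, which is Equation (4.5); here the squared bias is a modest part of the mean-square error because the sample is very small.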

Precision is a measure of repeatability. Conceptually, it is a
measure of the dispersion of estimates that would be generated by repetition
of the same sampling and estimation procedures many times under the same
conditions. With reference to the sampling distribution, it is a measure
of the dispersion of the estimates from the center of the distribution and
does not include any indication of where the center of the distribution
is in relation to a target.

In Illustrations 4.1, 4.2, and 4.3, the target value was implicitly
assumed to be $\bar{X}$; that is, T was equal to $\bar{X}$. Therefore, B
was zero and the mean-square error of $x'$ was the same as the variance of
$x'$. In Illustrations 4.4 and 4.5 the picture was broadened somewhat by
introducing response error and examining, theoretically, the impact of
response error on $E(x')$ and $\sigma_{x'}$. In practice many factors have
potential for influencing the sampling distribution of $x'$. That is, the
data in a sample are subject to error that might be attributed to several
sources.

From sample data an estimate, $x'$, is computed and an estimate of the
variance of $x'$ is also computed. How does one interpret the results? In
Illustrations 4.4 and 4.5 we found that response error could be divided
into bias and variance. The error from any source can, at least concep-
tually, be divided into bias and variance. An estimate from a sample is
subject to the combined influence of bias and variance corresponding to
each of the several sources of error. When an estimate of the variance
of $x'$ is computed from sample data, the estimate is a combination of
variances that might be identified with various sources. Likewise the
difference between $E(x')$ and T is a combination of biases that might be
identified with various sources.

Figure 4.2 illustrates the sampling distribution of $x'$ for four
different cases: A, no bias and low standard error; B, no bias and large
standard error; C, large bias and low standard error; and D, large bias
and large standard error. The accuracy of an estimator is sometimes defined
as the square root of the mean-square error of the estimator. According
to that definition, we could describe estimators having the four sampling
distributions in Figure 4.2 as follows: In case A the estimator is precise
and accurate; in B the estimator lacks precision and is therefore
inaccurate; in C the estimator is precise but inaccurate because of bias;
and in D the estimator is inaccurate because of bias and low precision.

Figure 4.2--Examples of four sampling distributions (panels A to D, each
showing the positions of T and $E(x')$; C: Large bias--low standard error;
D: Large bias--large standard error)

Figure 4.3--Sampling distribution--Each small dot corresponds to an estimate

Unfortunately, it is generally not possible to determine, exactly,
the magnitude of bias in an estimate, or of a particular component of bias.
However, evidence of the magnitude of bias is often available from general
experience, from knowledge of how well the survey processes were performed,
and from special investigations. The author accepts a point of view that
the mean-square error is an appropriate concept of accuracy to follow. In
that context, the concern becomes a matter of the magnitude of the mse and
the size of B relative to $\sigma_{x'}$. That viewpoint is important
because it is not possible to be certain that B is zero. Our goal should be
to prepare survey specifications and to conduct survey operations so B is
small in relation to $\sigma_{x'}$. Or, one might say we want the mse to be
minimum for a given cost of doing the survey. Ways of getting evidence on
the magnitude of bias is a major subject and is outside the scope of this
publication.

As indicated in the previous paragraph, it is important to know some-
thing about the magnitude of the bias, B, relative to the standard error,
$\sigma_{x'}$. The standard error is controlled primarily by the design of
a sample and its size. For many survey populations, as the size of the
sample increases, the standard error becomes small relative to the bias. In
fact, the bias might be larger than the standard error even for samples of
moderate size, for example a few hundred cases, depending upon the circum-
stances. The point is that if the mean-square error is to be small, both
B and $\sigma_{x'}$ must be small. The approaches for reducing B are very
different from the approaches for reducing $\sigma_{x'}$. The greater
concern about nonsampling error is bias rather than impact on variance. In
the design and selection of samples and in the processes of doing the
survey an effort is made to prevent biases that are "sampling" in origin.
However, in survey work one must be constantly aware of potential biases
and on the alert to minimize biases as well as random error (that is,
$\sigma_{x'}$).
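The way a fixed bias comes to dominate as the sample grows can be shown with simple arithmetic. The figures below are hypothetical and are not taken from the text: a population standard deviation of 10, a bias of 0.5, and a population large enough that the finite population correction is ignored.

```python
import math

S = 10.0       # hypothetical population standard deviation
BIAS = 0.5     # hypothetical bias B, assumed not to change with n


def standard_error(s: float, n: int) -> float:
    """Standard error of a sample mean, ignoring the finite population correction."""
    return s / math.sqrt(n)


def root_mse(bias: float, s: float, n: int) -> float:
    """Square root of the mean-square error, sqrt(B^2 + standard error^2)."""
    return math.sqrt(bias ** 2 + standard_error(s, n) ** 2)


for n in (100, 400, 1600, 6400):
    se = standard_error(S, n)
    print(n, round(se, 3), round(root_mse(BIAS, S, n), 3), BIAS > se)
```

Under these assumed figures the standard error equals the bias at n = 400, and beyond that point further increases in sample size do little to reduce the root mean-square error.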

The above discussion puts a census in the same light as a sample.
Results from both have a mean-square error. Both are surveys with refer-
ence to use of results. Uncertain inferences are involved in the use of
results from a census as well as from a sample. The only difference is
that in a census one attempts to get a measurement for all N elements,
but making n = N does not reduce the mse to zero. Indeed, as the sample
size increases, there is no positive assurance that the mse will always
decrease, because, as the variance component of the mse decreases, the
bias component might increase. This can occur especially when the popu-
lation is large and items on the questionnaire are such that simple,
accurate answers are difficult to obtain. For a large sample or a census,
compared to a small sample, it might be more difficult to control factors
that cause bias. Thus, it is possible for a census to be less accurate
(have a larger mse) than a sample wherein the sources of error are more
adequately controlled. Much depends upon the kind of information being
collected.

4.5 BIAS AND STANDARD ERROR

The words "bias," "biased," and "unbiased" have a wide variety of
meaning among various individuals. As a result, much confusion exists,
especially since the terms are often used loosely. Technically, it seems
logical to define the bias in an estimate as being equal to B in Equation
(4.6), which is the difference between the expected value of an estimate
and the target value. But, except for hypothetical cases, numerical values
do not exist for either $E(x')$ or the target T. Hence, defining an unbiased
estimate as one where $B = E(x') - T = 0$ is of little, if any, practical
value unless one is willing to accept the target as being equal to $E(x')$.
From a sampling point of view there are conditions that give a rational
basis for accepting $E(x')$ as the target. However, regardless of how the
target is defined, a good practical interpretation of $E(x')$ is needed.

It has become common practice among survey statisticians to call an
estimate unbiased when it is based on methods of sampling and estimation
that are "unbiased." For example, in Illustration 4.4, x would be referred
to as an unbiased estimate, unbiased because the method of sampling and
estimation was unbiased. In other words, since x was an unbiased estimate
of X, x could be interpreted as an unbiased estimate of the result that
would have been obtained if all elements in the population had been
measured.

In Illustration 4.5 the expected value of x is more difficult to
describe. Nevertheless, with reference to the method of sampling and
estimation, x was "unbiased" and could be called an unbiased estimate
even though E(x) is not equal to f .

The point is that a simple statement which says, "the estimate is
unbiased" is incomplete and can be very misleading, especially if one is 
not familiar with the context and concepts of bias. Calling an estimate 
unbiased is equivalent to saying the estimate is an unbiased estimate of 


its expected value. Regardless of how "bias" is defined or used, E(x')
is the mean of the sampling distribution of x'; and this concept of E(x')
is very important because E(x') appears in the standard error, σ_x', of x'
as well as in B. See Equations (4.6) and (4.7).

As a simple concept or picture of the error of an estimate from a
survey, the writer likes the analogy between an estimate and a shot at
a target with a gun or an arrow. Think of a survey being replicated
many times using the same sampling plan, but a different sample for each
replication. Each replication would provide an estimate that corresponds
to a shot at a target.

In Figure 4.3, each dot corresponds to an estimate from one of the
replicated samples. The center of the cluster of dots is labeled E(x')
because it corresponds to the expected value of an estimate. Around the
point E(x') a circle is drawn which contains two-thirds of the points.
The radius of this circle corresponds to σ_x', the standard error of the
estimate. The outer circle has a radius of two standard errors and contains
95 percent of the points. The target is labeled T. The distance
between T and E(x') is bias, which in the figure is greater than the
standard error.
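
The picture of repeated shots at a target can be imitated by simulation. The sketch below is only an illustration under stated assumptions: the population is artificial, the sampling is simple random sampling without replacement, and the bias is a hypothetical constant added to every estimate. It replicates the "survey" many times and reports the center of the estimates, their standard error, the bias, and the share of estimates falling within one and two standard errors of the center.

```python
import random
import statistics

random.seed(12345)  # fixed seed so the illustration is repeatable

# Hypothetical population and survey design (all values assumed).
N, n, replications = 2000, 100, 4000
population = [random.gauss(50.0, 10.0) for _ in range(N)]
target = statistics.fmean(population)   # T, the true population mean
response_bias = 3.0                     # assumed constant over-reporting

# Each replication is one "shot": a new sample under the same plan.
estimates = []
for _ in range(replications):
    sample = random.sample(population, n)
    estimates.append(statistics.fmean(sample) + response_bias)

center = statistics.fmean(estimates)       # approximates E(x')
std_error = statistics.pstdev(estimates)   # approximates sigma_x'
bias = center - target                     # distance from T to E(x')

within_one = sum(abs(e - center) <= std_error for e in estimates) / replications
within_two = sum(abs(e - center) <= 2 * std_error for e in estimates) / replications

print("center of estimates:", round(center, 3))
print("standard error:", round(std_error, 3))
print("bias:", round(bias, 3))
print("share within one standard error:", round(within_one, 3))
print("share within two standard errors:", round(within_two, 3))
```

Under these assumptions roughly two-thirds of the estimates should fall within one standard error of the center and about 95 percent within two, while the bias, as in the figure, is larger than the standard error. None of the replicated estimates reveals the bias by itself; it is visible here only because the target was constructed.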

In practice, we usually have only one estimate, x', and an estimate,
s_x', of the standard error of x'. With reference to Figure 4.3, this
means one point and an estimate of the radius of the circle around E(x')
that would contain two-thirds of the estimates in repeated samplings. We
do not know the value of E(x'); that is, we do not know where the center
of the circles is. However, when we make a statement about the standard
error of x', we are expressing a degree of confidence about how close a


particular estimate prepared from a survey is to E(x'); that is, how
close one of the points in Figure 4.3 probably is to the unknown point
E(x'). A judgment as to how far E(x') is from T is a matter of how T
is defined and assessment of the magnitude of biases associated with
various sources of error.

Unfortunately, it is not easy to make a short, rigorous, and complete
interpretative statement about the standard error of x'. If the estimated
standard error of x' is three percent, one could simply state that fact
and not make an interpretation. It does not help much to say, for example,
that the odds are about two out of three that the estimate is within three
percent of its expected value, because a person familiar with the concepts
already understands that and it probably does not help the person who is
unfamiliar with the concepts. Suppose one states, "the standard error of
x' means the odds are two out of three that the estimate is within three
percent of the value that would have been obtained from a census taken
under identically the same conditions." That is a good type of statement
to make but, when one engages in considerations of the finer points,
interpretation of "a census taken under identically the same conditions"
is needed, especially since it is not possible to take a census under
identically the same conditions.
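
As a small worked example of having "one point and an estimate of the radius," the sketch below computes an estimate x', its estimated standard error s_x', and the standard error expressed as a percent of the estimate. It assumes simple random sampling without replacement and the usual estimator for that design; the function name, the sample values, and the population size are hypothetical.

```python
import math
import statistics


def estimate_with_standard_error(sample, N):
    """Return the sample mean, its estimated standard error, and the
    standard error as a percent of the mean, assuming a simple random
    sample drawn without replacement from N elements."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)            # divisor n - 1
    se = math.sqrt((1.0 - n / N) * s2 / n)      # finite population correction
    return mean, se, 100.0 * se / mean


# Hypothetical observations from a single survey.
sample = [12.0, 15.0, 9.0, 14.0, 10.0, 13.0, 11.0, 12.0]
x_prime, s_x, rel_se = estimate_with_standard_error(sample, N=200)

print("estimate x':", x_prime)
print("estimated standard error:", round(s_x, 4))
print("standard error as percent of estimate:", round(rel_se, 2))
print("one-standard-error interval:", (round(x_prime - s_x, 4), round(x_prime + s_x, 4)))
```

The interval printed at the end describes how close the estimate probably is to E(x'), not to the target T. Whatever bias is present is left untouched by this calculation.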

In summary, think of a survey as a fully defined system or process
including all details that could affect an estimate, including: the method
of sampling; the method of estimation; the wording of questions; the order 
of the questions on the questionnaire; interviewing procedures; selection, 
training, and supervision of interviewers; and editing and processing of 


data. Conceptually, the sampling is then replicated many times, holding
all specifications and conditions constant. This would generate a sampling
distribution as illustrated in Figure 4.2 or 4.3. We need to
recognize that a change in any of the survey specifications or conditions,
regardless of how trivial the change might seem, has a potential for
changing the sampling distribution, especially the expected value of x'.
Changes in survey plans, even though the definition of the parameters
being estimated remains unchanged, often result in discrepancies that
are larger than the random error that can be attributed to sampling.

The points discussed in the latter part of this chapter were included
to emphasize that much more than a well designed sample is required to
assure accurate results. Good survey planning and management call for
evaluation of errors from all sources and for trying to balance the effort 
to control error from various sources so the mean-square error will be 
within acceptable limits as economically as possible. 






